FEATURES
32-bit internal architecture
32-bit external data bus
64M-byte linear program address space
4G-byte linear data address space
Bus timing optimized for standard DRAM usage with page mode operation
64+M-byte/second bus bandwidth
Simple/powerful instruction set providing an excellent high level language compiler target
Hardware support for virtual memory systems
Low interrupt latency for real-time application requirements
Full CMOS implementation results in low power consumption
Single 5 V ± 5% operation
100-pin ceramic pin grid array (CPGA), quad plastic flatpack (QPFP), or MegaCell format in Apple VLSI Library


PIN DIAGRAM
CERAMIC PIN GRID ARRAY (BOTTOM VIEW)


32-BIT RISC MICROPROCESSOR 


DESCRIPTION 

The VL2340 Apple Proprietary RISC 
Machine (APRM) is a full 32-bit general- 
purpose microprocessor designed using 
reduced instruction set computer 
(RISC) methodologies. The APRM is an 
Apple Computer, Inc. owned version of 
the powerful Acorn RISC Machine 
(ARM). Applications in which the proc- 
essor is useful include laser printers, 
graphics engines, and any other 
systems requiring fast real-time 
response to external interrupt sources 
and high processing throughput. 


The APRM features a 32-bit data bus, 27 registers of 32 bits each, a load-store
architecture, a partially overlapping register set, 1.5 µs worst-case interrupt
latency (at 16 MHz operation), conditional instruction execution, a 26-bit linear
program address space, a 32-bit linear data address space, and an average
instruction execution rate of 12 to 14 million instructions per second (MIPS) at
16 MHz. Additionally, the processor supports two addressing modes: program counter
(PC) relative and base register relative. The ability to do pre- and post-indexing
allows stacks and queues to be easily implemented in software. All instructions
are 32 bits long (aligned on word boundaries), with register-to-register
operations executing in one cycle. The two data types directly supported are
8-bit bytes and 32-bit words, with efficient compiler generated support for
16-bit values. The APRM includes support for unaligned access to 32-bit data
words.


Using a load-store architecture simpli- 
fies the execution unit of the processor, 
since only a few instructions deal 
directly with memory and the rest 
operate register-to-register. Load and 
store multiple register instructions 
provide enhanced performance, making 
context switches faster and exploiting 
sequential memory access modes. 


The processor supports two types of
interrupts that differ in priority and
register usage. The lowest latency is
provided by the fast interrupt request
(FIRQ) which is used primarily for I/O to
peripheral devices. The other interrupt
type (IRQ) is used for interrupt routines 
that do not demand low-latency service 
or where the overhead of a full context 
switch is small compared with the 
interrupt process execution time. 


Package 
Ceramic Pin 
Grid Array (CPGA) 


Plastic Quad 
Flatpack (PQFP) 


see tCK Min. 


Note: Operating temperature range is 0°C to +70°C.


SIGNAL DESCRIPTIONS

Signal Pin Signal
Name Number (1) Description

CLK Processor Clock Input - This input provides the clock to the circuit. The Φ2 internal clock is in
phase with this input and the Φ1 internal clock is the nonoverlapping inverse.

-IRQ Interrupt Request Input - This is the normal interrupt request pin. It may be asserted
asynchronously to cause the processor to be interrupted. It is active low.

-FIRQ Fast Interrupt Request Input - This interrupt request line has a higher priority than IRQ, but
otherwise is the same. It too is active low.

RES Reset Input - This is the reset signal for the processor. While active, the processor executes
no-ops until the signal goes inactive, from which point execution starts at the Reset Vector
location. This signal is active high.

ABORT Abort Input - This signal can be used to abort the current bus cycle being executed by
the processor. Typically, it is connected to a memory management unit to control accesses for
protection. The abort signal is active-high.

D31 - D0 See Package Data31 - Data0 - This is the 32-bit bidirectional data bus used to transfer data to and from the
memory. These lines are tri-state and active-high.

DBE B11 Data Bus Enable Input - This is the asynchronous tri-state control signal for controlling
the drivers of the data bus. When asserted the data bus is enabled. This signal is active
high.

-B/W A11 Not Byte / Word Output - This “early warning” (note 2) signal indicates to the memory system
that the current fetch is a byte fetch rather than a word fetch. It is asserted during the last
portion of the cycle preceding the cycle that requires a byte fetch. When asserted (low) the
memory system should deal with bytes. It is active-low. While RES is active -B/W will remain
high.

-M1, -M0 Mode 1,0 Outputs - These two signals are used to indicate the current operating mode of
the processor. They can be used as address space modifiers to increase the address space,
or to assist a memory management unit in offering various protection modes. The lines are
active-low and the inverse of bits 1,0 of the processor status register. While RES is active M0
and M1 retain their previous states.

-M1 -M0 MODE
0 0 Supervisor
0 1 FIRQ
1 0 IRQ
1 1 USER

A31 - A0 See Package Address 31 - Address 0 Outputs - These are the 32 address lines. A0 and A1 are byte
addresses and should be ignored during opcode fetch cycles. During opcode fetches, the
current mode value may appear on these signals. The address lines are tri-state and active-
high.

ABE Address Bus Enable Input - This is the asynchronous three-state control signal for
controlling the drivers of the address bus. When asserted the address bus is enabled. The
signal is active-high.

ALE Address Latch Enable Input - This signal is used to control internal transparent latches on
the address outputs. When ALE is high the address outputs change during Φ2 to the value
required for the next cycle. Direct interfacing to ROMs requires address lines to be stable
until the end of Φ2. Holding ALE low until the end of Φ2 will latch the address outputs for
ROM cycles. Systems that do not directly interface to ROMs may tie ALE high.

-R/W B10 Not Read/Write Output - This is the read / write signal from the processor. When asserted
(low), it indicates that the processor is performing a read operation. When negated (high),
the processor is performing a write operation. This signal is an “early warning” (note 2) signal
and is active low. While RES is active -R/W will remain low.

—MREQ B9 Next Memory Cycle Start Output - This is an “early warning” (note 2) indicator that is asserted 


before the processor will start a memory cycle during the next clock phase. This signal is 
active low. While RES is active -MREQ will remain low. 


-TRAN A6 Translate Enable Output - This signal, when asserted by the processor tells a memory 
management unit that translation should be done on the current address. When negated, it 
indicates that the address should pass through untranslated. This signal is active-low. 


—OPC A10 Instruction Fetch Output - This “early warning” (note 2) signal when asserted indicates that the 
current bus cycle is an instruction fetch. This signal is active-low. While RES is active -OPC 
will remain low. 


SEQ B5 Next Address Sequential Output - This “early warning” (note 2) signal is asserted when the
processor will generate a sequential address during the next memory cycle. It may be used to
control fast memory access modes. This signal is active-high. While RES is active SEQ will
remain high.


—CPI A12 Coprocessor Instruction (CMOS level output) - When the APRM executes a coprocessor
instruction, this output is driven low and the processor will wait for a response from an attached
coprocessor device. The action taken is dependent upon the coprocessor response signalled
on the CPA and CPB inputs.


CPB B6 Coprocessor Busy (TTL level input) - An attached coprocessor that is capable of performing
the operation which the APRM is requesting (by asserting —CPI), but cannot begin
immediately, should indicate the busy condition by driving this signal high. When the coprocessor
is ready to start it should bring the CPB signal low. The APRM samples this signal on the
falling edge of the Φ1 clock while —CPI is active (low).


CPA C12 Coprocessor Absent (TTL level input) - A coprocessor capable of executing the operation
currently requested by the APRM (—CPI active) should bring CPA low immediately. If
CPA is high on the falling edge of the Φ1 clock, the processor will abort the coprocessor
handshake and take the undefined instruction trap. If CPA is low and remains low during
the —CPI active time, then the APRM will busy-wait until the CPB signal becomes low and
complete the coprocessor instruction.


NADR L7 Next address (CMOS level input) - When asserted selects the current address plus four for 
non-aligned memory reads. Also, the appropriate data bytes are latched from the first word 
from memory. This signal is active high. 


PAGE1, PAGE0 A3, B4 Page size (CMOS level inputs) - These two signals are decoded to determine the DRAM page
size of the memories used.

P1 P0 Page Size
0 0 256 words
0 1 512 words
1 0 1024 words
1 1 2048 words

MSBLOW B3 Most Significant byte low (CMOS input) - When asserted this input forces the upper eight
address outputs low. This input is active high.

PGHIT A2 Page hit (CMOS level output) - This early warning signal indicates that the current memory
operation is the last address in the active page. This output is active low.

NOTES: 


1. Pin numbers are for ceramic pin grid array package only.
2. “Early warning” signals are asserted during the last portion of the cycle preceding the cycle to which they apply.


FUNCTIONAL DESCRIPTION
The philosophy of RISC processor 
design is based on the idea that some 
processing functions can be moved 
from hardware to software with the 
result that the simplified hardware can 
actually execute functions in software 
faster than with complicated hardware. 
Analysis done several years ago at 
major research centers has shown that 
a processor and compiler combination 
can replace the traditional processor- 
alone architectures. An historical fact of 
the 16-bit processor world is that after 
chip designers spent many man-months 
figuring out how to implement universally
acceptable complicated instructions
to do things, few compiler writers
actually took advantage of these 
complex instructions. Most compilers 
only use a fraction of the instructions 
and addressing modes of traditional 
computer architectures. 


The user pays for the unused silicon 
required to implement these instruc- 
tions. He pays for the inefficient 
utilization in both cost of the processor 
and in lower performance. The silicon 
spent for complex instruction decoding 
and micro-sequencing could have been 
used for additional pipelining, larger 
register sets, or other special-purpose 
hardware that can be used efficiently. If 
the addition of a new instruction causes 
all instructions to execute 10% slower 
due to internal processor delays, then 
the new instruction had better be used 
more than 10% of the time otherwise 
overall performance has been sacri- 
ficed. This makes an argument for 
simple performance oriented architec- 
tures that are more dependent on 
compiler technology to implement less 
frequently used instructions. 


COMPARISON OF PROCESSORS 
Inherent in the concept of RISC proces- 
sors is the notion that more instructions 
are required to implement the same 
functions that could be done by fewer 
instructions with a complex instruction 
set computer (CISC) processor. In 
most cases even when more instructions
are needed by RISC processors,
the function can still be performed
more quickly on RISC processors than CISC
processors. This is causing the industry
to doubt the Million Instructions Per
Second (MIPS) ratings of RISC processors,
for good reason. The term MIPS
is often used exclusively as a means of
benchmarking performance. A better 
measure of performance is to time 
actual execution of real-world problems, 
independent of the number of instruc- 
tions required to implement the 
function. 


Benchmarks using compiled QuickDraw 
routines approximate real conditions. 
Measurements are based on pixel per 
second generation, bit-field extraction 
rates, etc. Running well below its
maximum specified clock rate, the
APRM, running compiled code, will
outperform all popular, commercially 
available microprocessors running hand 
crafted code. 


An important parameter to keep 
constant when benchmarking proces- 
sors is the memory access times, since 
not all processors will meet perform- 
ance claims when working with com- 
modity memories. 


Another traditional measure of perform- 
ance in the microprocessor world is the 
clock frequency of the processor. 
Faster is better has been the rule of 
thumb, but what is actually the most 
important consideration is the average 
number of bus cycles per instruction. A 
processor with a low clock frequency 
and a low number of bus cycles per 


instruction can actually outperform a 
processor with a high clock frequency 
and a higher number of bus clock 
cycles per instruction. The best choice
of processor is one that benchmarks
high while using a relatively low clock 
frequency and a small number of clocks 
per instruction executed. The APRM 
possesses these characteristics, giving 
it the best future evolution path to 
exploit advances in process technology. 


PROGRAMMING MODEL 

The APRM contains a large, partially 
overlapping set of twenty-seven 32-bit 
registers, although the programmer can 
access only sixteen registers in any 
mode of operation. Fifteen of the 
registers are general purpose; with the 
remaining twelve dedicated to functions 
such as User Mode, FIRQ Mode, IRQ 
Mode, Supervisor mode and the 
Program Counter (PC) / Processor
Status Register (PSR). Figure 1 shows
the register model of the APRM. Registers
R0-to-R13 are accessible from the
user mode for any purpose. Register
R14, the user-mode return-link
register, is specific to the user
mode. Its contents are swapped with
those of the other return-link registers as
the mode is changed. The return-link
register is used by the Branch-and-Link
instruction in a procedure call sequence
but may be used as a general-purpose
register at other times. The least
significant two bits of the processor
status word (PSW) define the current
mode of operation.


FIGURE 1. VL2340 REGISTER MODEL


Seven registers are dedicated to the
FIRQ mode and overlie user-mode
registers R8-to-R14 when the fast
interrupt request is serviced. The
registers R8 FIRQ-to-R13 FIRQ are
local to the fast interrupt service routine
and are used instead of the user-mode
registers R8-R13. Register R14 FIRQ
holds the address used to restart the
interrupted program instead of pushing
it onto a stack at the expense of another
memory cycle. Using a link register
helps provide very fast servicing of I/O
related interrupts without disturbing the
contents of the general-purpose register
set, although the FIRQ routine can
access the R0-to-R7 user-mode
registers if desired. The FIRQ mode is
used typically for very short interrupt
service routines that might fetch and
store characters in a disk- or tape-
controller application.


The next two registers are dedicated to 
the IRQ mode and overlie user mode 
registers R13 and R14 when the IRQ is 
serviced. Once again R14 IRQ is the 
return link register that holds the restart 
address and R13 IRQ is general- 
purpose and dedicated to the IRQ 
mode. This mode is used when the 
interrupt service routine will be lengthy 
and the overhead of saving and 
reloading the register set will not be a 
significant portion of the overall execu- 
tion time. 


Two registers are dedicated to the 
supervisor mode and overlay user mode 
registers R13 and R14 when a supervi- 
sor mode switch is made using a 
software interrupt (SWI) instruction. 
Operation of these two registers is the 
same as previously discussed. 


The last register (R15) contains the 
processor status word and program 
counter and is shared by all modes of 
operation. The upper six bits are 
processor status, the next 24 bits are 
the program counter (word address), 
and the last two indicate the mode. 
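
The split of R15 described above (six status bits, a 24-bit word address, two mode bits) can be sketched in Python as follows. This is only an illustrative model, not part of the device documentation: the function names are hypothetical, and the order of the six status bits (N, Z, C, V, then the two interrupt disable bits, from bit 31 downward) is an assumption, since the text only states that they occupy the upper six bits.

```python
# Illustrative model of the R15 layout: 6 status bits, 24-bit PC word
# address, 2 mode bits. The order of the six status bits is an assumption;
# the text only says they are the upper six bits of R15.
FLAG_NAMES = ("N", "Z", "C", "V", "I", "F")  # assumed order, bit 31 down to bit 26


def pack_r15(flags, word_address, mode):
    """Combine status flags, PC word address and mode bits into one 32-bit word."""
    if not 0 <= word_address < (1 << 24):
        raise ValueError("word address must fit in 24 bits")
    if not 0 <= mode < 4:
        raise ValueError("mode must fit in 2 bits")
    status = 0
    for i, name in enumerate(FLAG_NAMES):
        if flags.get(name, False):
            status |= 1 << (5 - i)
    return (status << 26) | (word_address << 2) | mode


def unpack_r15(r15):
    """Split a 32-bit R15 value into (flags, word_address, mode)."""
    status = (r15 >> 26) & 0x3F
    flags = {name: bool((status >> (5 - i)) & 1) for i, name in enumerate(FLAG_NAMES)}
    word_address = (r15 >> 2) & 0xFFFFFF
    mode = r15 & 0x3
    return flags, word_address, mode
```

Because the program counter holds a word address, the byte address of the instruction is the word address multiplied by four, which is why 24 bits cover the 64M-byte program space.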


PROCESSOR STATUS REGISTER 
Like most 32-bit processors, the APRM
makes a distinction between user and
supervisor modes: the user executes at
the lowest privilege level, and the
supervisor and interrupts execute at
higher levels of privilege. Figure 2
shows the processor status word
containing the control line states
associated with each mode.


FIGURE 2. PROCESSOR STATUS REGISTER


Translate is a control signal provided by 
the processor for control of an external 
memory management unit. The 
translate line is enabled in the user 
mode and disabled in the supervisor, 
fast interrupt and normal interrupt 
modes, since all modes except for the
user mode are expected to be running 
secure code. Translated fetches can be 
made from the supervisor mode by 
setting an optional bit in the load / store 
instructions. 


The processor status register (PSR) 
contains the program counter, mode 
control bits, and condition codes as 
shown in Figure 2. The bits marked 
with an asterisk are alterable only from 
non-user modes. If the user tries to 
write to these bits, they remain un- 
changed and the processor continues 
operation in the user mode. In other 
words, this is not a trap condition. The 
flags in the processor status register are 
the standard Negative, Zero, Carry, and 
Overflow. The sixteen allowable 
combinations of the condition code bits 
are shown in Table 1. These combina- 
tions are used for all conditional 
instruction execution since a conditional 
branch is nothing more than a jump
instruction with conditional execution. 
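
As a rough illustration of how the sixteen condition mnemonics of Table 1 map onto the Negative, Zero, Carry, and Overflow flags, the Python sketch below expresses each one as a predicate. The names `CONDITIONS` and `condition_passes` are hypothetical. The signed comparisons are written in the equivalent form "N equals V" or "N differs from V", and HI is taken as "carry set and zero clear", which is an assumption about the intended unsigned comparison.

```python
# Each condition mnemonic as a predicate over the four flags (n, z, c, v).
# GE/LT/GT/LE use the equivalent forms N == V and N != V.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "CS": lambda n, z, c, v: c,
    "CC": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,      # assumed: carry set AND zero clear
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (n == v) and not z,
    "LE": lambda n, z, c, v: (n != v) or z,
    "AL": lambda n, z, c, v: True,
    "NV": lambda n, z, c, v: False,
}


def condition_passes(cond, n, z, c, v):
    """Return True if an instruction carrying this condition would execute."""
    return bool(CONDITIONS[cond](bool(n), bool(z), bool(c), bool(v)))
```

A conditional branch is then just a branch whose condition is looked up this way; any other instruction is skipped or executed by the same test.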


EXCEPTIONS 

The APRM supports a partially overlap- 
ping register set so that when interrupts 
are taken, the contents of the register 
array do not have to be saved before 
new operations can begin. Improved 
response time is accomplished, in the 
case of the fast interrupt, by dedicating
six general-purpose registers, in
addition to a return-link register, that are
only accessible in the FIRQ mode.
These dedicated registers can contain
all the pointers and byte-counts for
simple I/O service routines thus
incurring no overhead when context 
switching between processing and 
servicing interrupts at high rates. The 
other modes (IRQ and SUP) each have 
one general-purpose and one return 
address (link) register dedicated to 
them. The general-purpose register is 
ideally suited for implementing a local 
stack for each mode. The need for 
dedicated registers in these modes is 
not as great since the time spent in an 
interrupt or supervisor routine is on the 
average much greater than the time 
spent in transition between the routines. 
The working registers can be saved and 
restored from stacks without significant 
overhead. 


The interrupt latency of the APRM is 
very short because the instruction 
execution time is typically two clocks, 
with a maximum of eighteen (for a
load-multiple instruction, loading sixteen 
registers). Once the processor recog- 
nizes an interrupt is pending, the time to 
begin processing is four clocks making 
a total worst-case interrupt latency of 
22.5 clocks. 
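
The worst-case figure in clocks converts to time by dividing by the clock frequency. The short Python sketch below shows the arithmetic; the function name is hypothetical, and the 22.5-clock default is simply the total quoted above.

```python
def interrupt_latency_us(clock_mhz, worst_case_clocks=22.5):
    """Worst-case interrupt latency in microseconds for a given clock in MHz."""
    return worst_case_clocks / clock_mhz
```

At 16 MHz this gives 22.5 / 16 = 1.40625 µs, which is consistent with the 1.5 µs worst-case latency quoted for 16 MHz operation.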


In addition to interrupts, five other types 
of exceptions are supported by the 
processor. These are data-fetch cycle 
aborts, instruction-fetch cycle aborts, 
software interrupts, undefined instruc- 
tion traps and reset. 


The APRM supports a 32-bit linear address space allowing a total of 4G-bytes of
physical memory. The total program space is limited to 26 bits of address space,
for a total of 64M-bytes used by program execution.

If the abort signal is asserted by the memory management unit during a data
fetch the processor will abort data transfer instructions (LDR, STR) as if they
had never been executed. If the instruction was a block data transfer (LDM, STM)
the processor will allow the instructions to complete. If the write-back control
bit in these instructions is set, the base address will be updated even if it
would have been overwritten during the instruction execution. An example of this
would be execution of a block data transfer instruction with the base register
in the list of registers to be overwritten.

Software interrupt instructions are used to change from user mode to supervisor
mode. When an SWI is encountered the processor will save the current program
counter (R15) into R14 SUP, set the mode bits to the supervisor mode, and start
execution at the software interrupt vector address. An undefined instruction
will cause a trap similar to the execution of a software interrupt except that
the Undefined Instruction Vector will be used as the next address. Reset is
treated similarly to the other traps and will start the processor from a known
address. When the reset condition is recognized the currently executing
instruction will terminate abnormally, the processor will enter the supervisor
mode, disable both the FIRQ and IRQ interrupts, and begin execution at address
0000H. While the reset condition remains the processor will execute dummy
instruction fetches.


TABLE 1. INSTRUCTION CONDITION CODES

Condition  Value  Operation
AL         1110   Always
CC         0011   Carry Clear/Unsigned Lower Than
CS         0010   Carry Set/Unsigned Higher Or Same
EQ         0000   Equal (Z Set)
GE         1010   Greater Than Or Equal ((N * V) + (-N * -V))
GT         1100   Greater Than (((N * V) + (-N * -V)) * -Z)
HI         1000   Higher Unsigned (C * -Z)
LE         1101   Less Than Or Equal (((N * -V) + (-N * V)) + Z)
LS         1001   Lower Or Same Unsigned (-C + Z)
LT         1011   Less Than ((N * -V) + (-N * V))
MI         0100   Negative (N)
NE         0001   Not Equal (-Z)
NV         1111   Never
PL         0101   Positive (-N)
VC         0111   Overflow Clear
VS         0110   Overflow Set


The processor exception vector map is illustrated in Table 2. The exceptions
are prioritized reset (highest), address exception, data abort, FIRQ, IRQ,
prefetch abort, undefined instruction, and software interrupt (lowest). These
vector addresses normally will contain a branch instruction to the associated
service routine except for the FIRQ entry. In order to further reduce latency,
the FIRQ service routine may begin at address 001CH if the software designer so
chooses.


TABLE 2. EXCEPTION VECTOR MAP

Columns: Address (Hex), Function, Priority Level.
Vector addresses: 000 0000, 000 0004, 000 0008, 000 000C, 000 0010, 000 0018, 000 001C.


Whenever the processor enters the supervisor mode, whether from an SWI,
undefined instruction trap, prefetch or data abort, the IRQ is disabled and the
FIRQ unchanged.


INSTRUCTION SET

The APRM supports five basic types of instructions, with several options
available to the programmer. These instruction types are: data processing, data
transfer, block data transfer, branch, and software interrupt. All instructions
contain a 4-bit conditional execution field (shown in Table 1) that can cause an
instruction to be skipped if the condition specified is not true. The execution
time for a skipped instruction is one sequential cycle (100 ns at 10 MHz).


Data processing instructions operate only on the internal register file, and
each has three operand references: a destination and two source fields. The
destination (Rd) can be any of the registers including the processor status
register, although some bits in R15 can only be changed in particular modes.
The source operands can have two forms: both can be registers (Rm and Rn) or a
register (Rn) and an 8-bit immediate value. Both forms of operand specification
provide for the optional shifting of one of the source values using the on-board
barrel shifter. If both operands are registers, the Rm can be shifted. For the
other case, it is the immediate value that can pass through the shifter. Another
field in these instructions allows for the optional updating of the condition
codes as a result of execution of the operation. Table 3 shows the possible data
processing operations and the status flags affected.


TABLE 3. DATA PROCESSING INSTRUCTIONS

Instruction  Function                     Operation                     Flags Affected
ADC          Add With Carry               Rd:=Rn+Shift(S2)+C            N, Z, C, V
ADD          Add                          Rd:=Rn+Shift(S2)              N, Z, C, V
AND          And                          Rd:=Rn AND Shift(S2)          N, Z, C
BIC          Bit Clear                    Rd:=Rn AND NOT Shift(S2)      N, Z, C
CMN          Compare Negative             Shift(S2)+Rn                  N, Z, C, V
CMP          Compare                      Rn-Shift(S2)                  N, Z, C, V
EOR          Exclusive OR                 Rd:=Rn XOR Shift(S2)          N, Z, C
MLA          Multiply with Accumulate     Rn:=Rm*Rs+Rd                  N, Z, C, V
MOV          Move                         Rd:=Shift(S2)                 N, Z, C
MUL          Multiply                     Rn:=Rm*Rs                     N, Z, C, V
MVN          Move Negated                 Rd:=NOT Shift(S2)             N, Z, C
ORR          Inclusive OR                 Rd:=Rn OR Shift(S2)           N, Z, C
RSB          Reverse Subtract             Rd:=Shift(S2)-Rn              N, Z, C, V
RSC          Reverse Subtract With Carry  Rd:=Shift(S2)-Rn-1+C          N, Z, C, V
SBC          Subtract With Carry          Rd:=Rn-Shift(S2)-1+C          N, Z, C, V
SUB          Subtract                     Rd:=Rn-Shift(S2)              N, Z, C, V
TEQ          Test For Equality            Rn XOR Shift(S2)              N, Z, C
TST          Test For Mask                Rn AND Shift(S2)              N, Z, C


Data transfer instructions are used to move data between memory and the
register file (load), or vice-versa (store). The effective address is calculated
using the contents of the source register (Rn) plus an offset of either a 12-bit
immediate value or the contents of another register (Rm). When the offset is a
register it can optionally be shifted before the address calculation is made.
Table 4 shows the addressing modes supported and their corresponding assembler
syntax. The offset may be added to, or subtracted from, the index register Rn.
Indexing can be either pre- or post-, depending on the desired addressing mode.
In the post-indexed mode the transfer is performed using the contents of the
index register as the effective address and the index register is modified by
the offset and rewritten. In the pre-indexed mode the effective address is the
index register modified in the appropriate manner by the offset. The modified
index register can be written back to Rn if the write-back bit is set or left
unchanged if desired. When a register is used as the offset, it can be
pre-scaled by the barrel shifter in a similar manner as with data processing
instructions.


TABLE 4. MEMORY ADDRESSING MODES

Addressing Mode                             Operation                          Syntax
PC Relative                                 EA* = PC +/- Offset (12 Bits)      LABEL
Base Register Offset With Post-Increment    EA* = Rn                           [Rn],Offset
                                            Rn +/- Offset -> Rn
Base Register Offset With Pre-Increment**   EA* = Rn +/- Offset (12 Bits)      [Rn,Offset]
                                            Rn +/- Offset -> Rn
Base Register Index With Post-Increment     EA* = Rn                           [Rn],Rm
                                            Rn +/- Rm -> Rn
Base Register Index With Pre-Increment**    EA* = Rn +/- Rm                    [Rn,Rm]
                                            Rn +/- Rm -> Rn

* Effective Address
** Program control of index register update; i.e., Rn may be left unchanged.
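
The difference between pre- and post-indexing can be sketched as a small Python function that returns both the effective address and the value left in the index register. This is a simplified model under stated assumptions: the function and parameter names are hypothetical, the offset is taken as an already-shifted unsigned value, and addresses wrap at 32 bits.

```python
MASK32 = 0xFFFFFFFF


def transfer_address(rn, offset, pre_indexed, add=True, write_back=True):
    """Return (effective_address, new_rn) for a single data transfer.

    rn          -- current contents of the index register
    offset      -- immediate or (already shifted) register offset
    pre_indexed -- True for pre-indexing, False for post-indexing
    add         -- True to add the offset, False to subtract it
    write_back  -- pre-indexed only: write the modified value back to Rn
    """
    modified = (rn + offset if add else rn - offset) & MASK32
    if pre_indexed:
        # The transfer uses the modified address; Rn changes only on write-back.
        return modified, (modified if write_back else rn)
    # Post-indexed: the transfer uses Rn as is, then Rn is rewritten.
    return rn, modified
```

For example, with Rn = 0x1000 and an offset of 4, pre-indexing transfers at 0x1004, while post-indexing transfers at 0x1000 and leaves 0x1004 in Rn.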


Data transfer instructions can manipu- 
late bytes or words in memory. When a 
byte is read from the memory, it is 
placed in the low-order 8-bits of the 
register and zero-extended to a full 
word. For byte writes the lower 8-bits of 
the register are replicated onto all four 
bytes of the data bus. The memory 


controller should be designed such that
only the addressed byte is updated in
the memory.


Words are written into the address
space most-significant byte first.
That is, the byte at the lowest address
will be found left-justified both in a register
and in its memory location, in "big-endian"
fashion. See Appendix 1 for details of
word and byte registration.
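
The byte handling described above can be modelled in a few lines of Python. This is a sketch rather than a description of any particular memory controller: the function names are hypothetical, and the byte-lane selection assumes the big-endian ordering stated in the text (lowest address in the most significant byte).

```python
def load_byte(memory_byte):
    """A byte read lands in the low-order 8 bits, zero-extended to a word."""
    return memory_byte & 0xFF


def store_byte_bus_value(register):
    """For a byte write, the low 8 bits are replicated on all four byte lanes."""
    b = register & 0xFF
    return b | (b << 8) | (b << 16) | (b << 24)


def write_byte_to_word(word, address, bus_value):
    """What a memory controller might do: update only the addressed byte.

    Assumes big-endian lanes: address bits 1..0 == 0 selects the most
    significant byte of the word.
    """
    shift = 8 * (3 - (address & 3))
    mask = 0xFF << shift
    return (word & ~mask & 0xFFFFFFFF) | (bus_value & mask)
```

Replicating the byte on every lane means the controller never has to shift data; it only has to gate the write to the one lane selected by the two low address bits.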


The APRM supports both logical and 
physical address spaces at a lower 
level in hardware than other proces- 
sors. Data transfer instructions contain 
a translate enable bit that allows non- 
user mode programs to select the 
logical or physical address space as 
desired. The bit from the instruction is 
placed on the TRAN pin of the proces- 
sor to signal an external memory 
management unit (MMU) whether to 
translate first or pass the address from 
the processor bus to the memory. This 
allows programs executing in the 
supervisor or interrupt modes to have 
easy access to user memory areas for 
page fault correction or to have bounds 
checking performed on dynamic data 
structures in the system space by the 
MMU. In the user mode, addresses are 
always translated by the MMU if it is
implemented in the system. 


The block data transfer instructions 
allow multiple registers to be moved in a 
single instruction. The instruction has a 
field containing a bit for each of the 
sixteen registers visible in the current 
mode. Bit 0 corresponds to R0, and bit
15 corresponds to R15, the program
counter. A bit set in a particular position
means that the corresponding register
will be affected by the transfer. The
registers are always saved from lowest
to highest, and R0 will always appear at
a lower address than R1. The ability to 
pre- or post- increment or decrement 
allows both stacks and queues to be 
implemented efficiently with any 
convention chosen by the programmer. 
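
The register-list field can be illustrated with a short Python sketch that maps each selected register to its memory address. The function name is hypothetical, and `lowest_address` is taken as a given input: how the increment/decrement and pre/post options determine that lowest address is not modelled here.

```python
def block_transfer_addresses(register_mask, lowest_address):
    """Map each register selected in the 16-bit mask to its memory address.

    Bit 0 selects R0 and bit 15 selects R15. Registers are laid out from
    lowest to highest, one word (4 bytes) apart, starting at lowest_address.
    """
    selected = [r for r in range(16) if (register_mask >> r) & 1]
    return {f"R{r}": lowest_address + 4 * i for i, r in enumerate(selected)}
```

For a mask selecting R0, R1 and R15 with a lowest address of 0x100, R0 goes to 0x100, R1 to 0x104 and R15 to 0x108, regardless of the stack convention in use.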


The branch instruction has two forms, 
branch and branch-with-link. The 
branch instruction causes execution to 
start at the current program counter 
plus a 24-bit offset contained in the in- 
struction. The offset is left-shifted by 
two bits (forming a 26-bit address) 
before it is added to the program 
counter. Since all instructions are word- 
aligned, a branch can reach any 
location in the program address space. 
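
The branch target arithmetic can be sketched in Python as below. The function name is hypothetical, and two assumptions are made: the sum wraps around within the 26-bit program space, and `pc` is whatever value the program counter holds when the offset is added (any pipeline adjustment is not modelled).

```python
PROGRAM_SPACE = 1 << 26  # 64M-byte program address space


def branch_target(pc, offset24):
    """Add the 24-bit word offset, shifted left by two, to the program counter."""
    if not 0 <= offset24 < (1 << 24):
        raise ValueError("offset must fit in 24 bits")
    return (pc + (offset24 << 2)) % PROGRAM_SPACE
```

Under the wrap-around assumption, an offset of 0xFFFFFF acts as a step of minus one word, which is how a 24-bit field reaches every word-aligned location in the 26-bit space in either direction.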
The branch-with-link instruction copies 


TABLE 5. INSTRUCTION EXECUTION TIMES 


Operation Time Source Shift Modification 
RS-#- RO | 1s _| _1StorShit(RS) | 18+ 1Nif PC Modified 
RS+RS- RD} 1s __——|_tSforShit(RS) | 18+ 1Nif PC Modified 
LOR | asein | 18+ IN PC Modified 
STR a 

LOM f(nt+1)S+in | | 18 + 1Nif PC Modified 
STM p(ne-)Se2N | 

BR 2S+iN | 

BR & LINK 2S + 1N i 

Swi sein | 

MUL, MLA ae 


* - The number of registers transfered in a Load/Store Multiple instruction. If the 

condition field in an instruction is not true, the instruction is skipped and the execu- 
tion time is 1S cycle. 
** - This is the worst case time. The actual time is a function of the value in the Rs 


register. 
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the program counter and processor 
status register into R14 prior to branch- 
ing to the new address. Returning from 
the branch-with-link simply involves 
reloading the program counter from R14 
(MOV PC,R14). The PSR can option- 
ally be restored from R14 (MOVS 
PC,R14). 
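
The branch offset arithmetic described above can be sketched in Python. The sketch assumes that the sum wraps within the 26-bit program address space and it ignores any pipeline adjustment of the program counter; the names branch_target and offset_for are hypothetical.

```python
ADDRESS_MASK = 0x03FFFFFC  # 26-bit program address, word-aligned (bits 1:0 clear)


def branch_target(pc, offset24):
    """Add the 24-bit instruction offset, shifted left two bits, to the PC."""
    if not 0 <= offset24 < (1 << 24):
        raise ValueError("offset must fit in 24 bits")
    # Assumption: the result wraps modulo 2^26.
    return (pc + (offset24 << 2)) & ADDRESS_MASK


def offset_for(pc, target):
    """Find the 24-bit offset field that takes a branch at pc to target."""
    return ((target - pc) >> 2) & 0xFFFFFF


forward = branch_target(0x100, 0x000010)   # 0x100 + 0x40
backward = branch_target(0x100, 0xFFFFFF)  # wraps round to 0x100 - 4
```

Because the shifted offset spans the full 26 bits, every word-aligned target has a matching offset field, whether the branch goes forwards or backwards.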


The software interrupt instruction format 
is used primarily for supervisor service 
calls. When this instruction is executed, 
the PC and PSR are saved in R14 SUP. 
The PC is then set to the SWI vector 
location and the processor placed in the 
supervisor mode. 


Instructions operate at speeds depend- 
ent upon the options selected. Table 5 
shows the instruction types, execution 
rates and adjustments for operand 
shifting or affecting the program 
counter. The table is expressed in terms 
of N and S cycles representing Non- 
sequential and Sequential cycles 
respectively. The processor is able to 
take advantage of memories that have 
faster access times when accessed 
sequentially in the nibble or column 
mode. These faster cycles are desig-
nated as S-cycles, while the N-cycles
typically take twice as long. If faster
static memory is used, the N and S 
cycles would be equal. 


The APRM is offered in two packages, 
a 100-pin ceramic pin grid array 
(CPGA) package and a 100-pin quad 
plastic flatpack (QPFP). 


S implies a sequential cycle. 
N implies a non sequential cycle. 


EXAMPLES OF THE INSTRUCTION SET


The following examples illustrate methods by which basic APRM instructions can be combined to yield efficient code. None of the
methods saves a large amount of execution time, although they all save some; mostly they result in more compact code.


EXAMPLE 1 - USING THE CONDITIONAL EXECUTION FOR THE LOGICAL-OR FUNCTION

        CMP    Rn, p              ; IF Rn = p OR Rm = q THEN
        BEQ    Label              ;   GOTO Label
        CMP    Rm, q
        BEQ    Label

By using conditional execution, the routine compresses to:

        CMP    Rn, p
        CMPNE  Rm, q              ; if Rn not equal p, try other test
        BEQ    Label

EXAMPLE 2 - ABSOLUTE VALUE

        TEQ    Rn, 0              ; check sign
        RSBMI  Rn, Rn, 0          ; and 2's complement if required


EXAMPLE 3 - UNSIGNED 32-BIT MULTIPLY

; Enter with numbers in Ra, Rb - product contained in Rm

        MOV    Rm, 0              ; init result register
LOOP    MOVS   Ra, Ra, LSR 1
        ADDCS  Rm, Rm, Rb
        ADD    Rb, Rb, Rb
        BNE    LOOP               ; stops when Ra becomes zero
                                  ; Rm = Ra * Rb
                                  ; (Ra = 0, Rb is altered)
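
The loop in Example 3 is a shift-and-add multiply, and a Python model of it can help when checking the flag behaviour. This is a sketch rather than a cycle-accurate simulation: the function name is illustrative, and registers are modelled as unsigned 32-bit values so the result is the low 32 bits of the product.

```python
MASK32 = 0xFFFFFFFF


def shift_add_multiply(ra, rb):
    """Model of Example 3: returns Rm, the low 32 bits of Ra * Rb."""
    ra &= MASK32
    rb &= MASK32
    rm = 0                            # MOV   Rm, 0
    while True:
        carry = ra & 1                # MOVS  Ra, Ra, LSR 1: bit 0 goes to carry
        ra >>= 1                      #       and the Z flag reflects the new Ra
        if carry:
            rm = (rm + rb) & MASK32   # ADDCS Rm, Rm, Rb
        rb = (rb + rb) & MASK32       # ADD   Rb, Rb, Rb (flags untouched)
        if ra == 0:                   # BNE   LOOP falls through when Z is set
            break
    return rm


product = shift_add_multiply(6, 7)
```

The loop always runs at least once, and it ends as soon as the remaining multiplier bits are all zero, so small values of Ra finish quickly.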


EXAMPLE 4 - MULTIPLICATION BY 4, 5, OR 6 AT RUN TIME

        MOV    Rc, Ra, LSL 2      ; multiply by 4
        CMP    Rb, 5              ; test multiplier value
        ADDCS  Rc, Rc, Ra         ; complete multiply by 5
        ADDHI  Rc, Rc, Ra         ; complete multiply by 6

EXAMPLE 5 - MULTIPLICATION BY CONSTANT (2^N)+1 USING THE BARREL SHIFTER (3, 5, 9, 17, ...)

        ADD    Ra, Ra, Ra, LSL n
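
Example 4 relies on the way a single CMP feeds two different conditions: CS is true when Rb is 5 or more (unsigned), and HI is true only when Rb is greater than 5. A Python model of the sequence is given below as a sketch; the function name is illustrative and the multiplier Rb is assumed to be 4, 5 or 6.

```python
MASK32 = 0xFFFFFFFF


def multiply_by_4_5_or_6(ra, rb):
    """Model of Example 4: returns Rc = Ra * Rb for Rb in {4, 5, 6}."""
    rc = (ra << 2) & MASK32           # MOV   Rc, Ra, LSL 2 (multiply by 4)
    if rb >= 5:                       # CMP   Rb, 5 leaves CS set when Rb >= 5
        rc = (rc + ra) & MASK32       # ADDCS Rc, Rc, Ra
    if rb > 5:                        # HI is set only when Rb > 5
        rc = (rc + ra) & MASK32       # ADDHI Rc, Rc, Ra
    return rc


results = [multiply_by_4_5_or_6(7, rb) for rb in (4, 5, 6)]
```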


EXAMPLE 6 - MULTIPLICATION BY CONSTANT (2^N) - 1 (3, 7, 15, ...)

        RSB    Ra, Ra, Ra, LSL n

EXAMPLE 7 - MULTIPLICATION BY 6

        ADD    Ra, Ra, Ra, LSL 1  ; multiply by 3
        MOV    Ra, Ra, LSL 1      ; and then by 2


EXAMPLE 8 - MULTIPLY BY 10 AND ADD EXTRA NUMBER (DECIMAL TO BINARY CONVERSION)

        ADD    Ra, Ra, Ra, LSL 2  ; multiply by 5
        ADD    Ra, Rc, Ra, LSL 1  ; multiply by 2 and add in next digit

EXAMPLE 9 - DIVISION AND REMAINDER

; enter with numbers in Ra and Rb
; result in Rc
; remainder in Ra

        MOV    Rcnt, 1            ; bit to control the division
DIV1    CMP    Rb, Ra             ; move Rb until greater than Ra
        MOVCC  Rb, Rb, LSL 1
        MOVCC  Rcnt, Rcnt, ASL 1
        BCC    DIV1
        MOV    Rc, 0
DIV2    CMP    Ra, Rb             ; test for possible subtraction
        SUBCS  Ra, Ra, Rb         ; subtract if valid
        ADDCS  Rc, Rc, Rcnt       ; put relevant bits in result
        MOVS   Rcnt, Rcnt, LSR 1  ; shift control bit
        MOVNE  Rb, Rb, LSR 1      ; halve unless finished
        BNE    DIV2
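
The division routine in Example 9 can be followed more easily in a high-level model. The Python below is a sketch that mirrors the two loops instruction by instruction. The function name is illustrative, and the model assumes the divisor is non-zero and that shifting it left never overflows a 32-bit register, neither of which is checked by the original routine.

```python
def divide(ra, rb):
    """Model of Example 9: returns (quotient Rc, remainder Ra) for unsigned inputs."""
    if rb == 0:
        raise ZeroDivisionError("the assembly routine does not handle Rb = 0")
    rcnt = 1                  # MOV   Rcnt, 1
    while rb < ra:            # DIV1: CMP Rb, Ra / BCC DIV1 (CC means Rb < Ra)
        rb <<= 1              # MOVCC Rb, Rb, LSL 1
        rcnt <<= 1            # MOVCC Rcnt, Rcnt, ASL 1
    rc = 0                    # MOV   Rc, 0
    while True:
        if ra >= rb:          # DIV2: CMP Ra, Rb (CS means Ra >= Rb)
            ra -= rb          # SUBCS Ra, Ra, Rb
            rc += rcnt        # ADDCS Rc, Rc, Rcnt
        rcnt >>= 1            # MOVS  Rcnt, Rcnt, LSR 1
        if rcnt == 0:         # BNE   DIV2 falls through when Rcnt reaches zero
            break
        rb >>= 1              # MOVNE Rb, Rb, LSR 1
    return rc, ra


quotient, remainder = divide(100, 7)
```

The control bit Rcnt tracks how far the divisor was shifted, so each successful subtraction adds the matching power of two to the quotient.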


In the following tables -MREQ and SEQ (which are pipelined up to one cycle
ahead of the cycle to which they apply) are shown in the cycle in which they
appear, so they predict the address of the next cycle. The address bus value,
-B/W, -R/W, and -OPC (which appear up to half a cycle ahead) are shown in
the cycle to which they apply.


BRANCH AND BRANCH WITH LINK
A branch instruction calculates the branch destination in the first cycle,
while performing a prefetch from the current PC. This prefetch is done in all
cases.

During the second cycle a fetch is performed from the branch destination,
and the return address is stored in register 14 if the link bit is set.

The third cycle performs a fetch from the destination + 4, refilling the
instruction pipeline, and if the branch is with link R14 is modified (four is
subtracted from it) to simplify return from SUB PC, R14, #4 to MOV PC,R14.
This makes the STM ..(R14) LDM ..(PC) type of subroutine work correctly.


TABLE 6. BRANCH AND BRANCH WITH LINK

Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC
[table rows illegible in source]

(PC is the address of the branch instruction, ALU is an address calculated by
the processor, (ALU) are the contents of that address, etc).

DATA OPERATIONS

A data operation executes in a single datapath cycle except where the shift is
determined by the contents of a register. A register is read onto the A Bus,
and a second register or the immediate field onto the B Bus. The ALU combines
the A Bus source and the shifted B Bus source according to the operation
specified in the instruction, and the result (when required) is written to the
destination register. (Compares and tests do not produce results, only the
ALU status flags are changed).

An instruction prefetch occurs at the same time as the above operation, and
the program counter is incremented.

When the shift length is specified by a register, an additional datapath cycle
occurs before the above operation to copy the bottom eight bits of that
register into a holding latch in the barrel shifter. The instruction prefetch
will occur during this first cycle, and the operation cycle will be internal
(i.e., will not request memory). The memory interface may be designed such
that this internal cycle can be configured to merge with the next cycle into a
single memory N-cycle.

The PC may be any (or all) of the register operands. When read onto the
A Bus it appears without the PSR bits, on the B Bus it appears with them.
Neither will affect external bus activity. When it is the destination, however,
external bus activity may be affected. If the result is written to the PC, the
contents of the instruction pipeline are invalidated, and the address for the
next instruction prefetch is taken from the ALU rather than the address
incrementer. The instruction pipeline is refilled before any further execution
takes place, and during this time exceptions are locked out.


TABLE 7. DATA OPERATIONS

Type | Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC
[table rows illegible in source; the first row type is Normal]


INSTRUCTION CYCLE OPERATIONS (Cont.)


MULTIPLY AND MULTIPLY
ACCUMULATE

The multiply instructions make use of special hardware which implements a
Booth's algorithm with early termination.

During the first cycle the accumulate register is brought to the ALU, which
either transmits it or produces zero (according to whether the instruction is
MLA or MUL) to initialize the destination register. During the same cycle one
of the operands is loaded into the Booth's shifter via the A Bus.

The datapath then cycles, adding the second operand to, subtracting it from,
or just transmitting, the result register.

The second operand is shifted in the Nth cycle by 2N or 2N + 1 bits, under
control of the Booth's logic. The first operand is shifted right two bits per
cycle, and when it is zero the instruction terminates (possibly after an
additional cycle to clear a pending borrow). All cycles except the first are
internal.

If the destination is the PC, all writing to it is prevented. The instruction
will proceed as normal except that the PC will be unaffected. (If the S bit is
set the PSR flags will be meaningless).

Type | Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC
[table rows illegible in source; row types cover (Rs) = 0,1 and larger values
of (Rs), with cycles numbered 1, 2, ... m, m+1]

(m is the number of cycles required by the Booth's algorithm; see the section
on instruction speeds).

TABLE 9. LOAD REGISTER


LOAD REGISTER


The first cycle of a load register 
instruction performs the address 
calculation. The data is fetched from 
memory during the second cycle, and 
the base register modification is 
performed during this cycle (if re- 
quired). During the third cycle the data 
is transferred to the destination 
register, and external memory is 
unused. This third cycle may normally 
be merged with the following prefetch 
to form one memory N-cycle. For 
details of registration during the load 
operation see Appendix 1. 


Either the base or the destination (or 
both) may be the PC, and the prefetch 
sequence will be changed if the PC is 
affected by the instruction. 


The data fetch may abort, and in this 


case the base and destination modifi- 
cations are prevented. 


(PC' is the PC value modified by write
back; t shows the cycle where the force
translation option in the instruction may
be used).

Type | Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC | -TRAN
[table rows illegible in source; row types include Normal and Dest = PC]

TABLE 10. STORE REGISTER 


STORE REGISTER 

The first cycle of a store register is 
similar to the first cycle of load register. 
During the second cycle the base 
modification is performed, and at the 
same time the data is written to 
memory. There is no third cycle. 


The PC will only be modified if it is the
base and write back occurs. A data 
abort prevents the base write back. 


See Appendix 1 for memory registration 
details. 


LOAD MULTIPLE REGISTERS 

The first cycle of LDM is used to 
calculate the address of the first word to 
be transferred, while performing a 
prefetch from memory. The second 
cycle fetches the first word, and 
performs the base modification. During 
the third cycle, the first word is moved 
to the appropriate destination register 
while the second word is fetched from 
memory, and the modified base is 
moved to the ALU A Bus input latch for 
holding in case it is needed to patch up 
after an abort. The third cycle is re- 
peated for subsequent fetches until the 
last data word has been accessed, then 
the final (internal) cycle moves the last 
word to its destination register. The last 
cycle may be merged with the next 
instruction prefetch to form a single 
memory N-cycle. 


If an abort occurs, the instruction
continues to completion, but all register
writing after the abort is disabled. The 
final cycle is altered to restore the 
modified base register (which may have 
been overwritten by the load activity 
before the abort occurred). If the PC is 
the base, write back is prevented. 


When the PC is in the list of registers to 
be loaded, and assuming that no abort 
takes place, the current instruction 
pipeline must be invalidated. Note that 
the PC is always the last register to be
loaded, so an abort at any point will
prevent the PC from being overwritten.


TABLE 11. LOAD MULTIPLE REGISTERS

Type | Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC
[table rows illegible in source; one row type is N Registers (N>1)]

TABLE 12. STORE MULTIPLE REGISTERS 


STORE MULTIPLE REGISTERS 
Store multiple proceeds very much as 
load multiple, without the final cycle.
The restart problem is much more 
straightforward here, as there is no 
wholesale overwriting of registers with 
which to contend. 


SOFTWARE INTERRUPT AND 
EXCEPTION ENTRY 

Exceptions (and software interrupts) 
force the PC to a particular value and 
refill the instruction pipeline from there. 
During the first cycle the forced address 
is constructed, and a mode change may 
take place. The return address is 
moved to register 14. 


During the second cycle the return 
address is modified to facilitate return, 
though this modification is less useful 
than in the case of branch with link. 


The third cycle is required only to 
complete the refilling of the instruction 


pipeline. 


UNDEFINED INSTRUCTIONS AND 
COPROCESSOR ABSENT 

When a Co-Processor detects a Co- 
Processor instruction which it cannot 
perform, and this must include all 
undefined instructions, it must not drive 
CPA or CPB. These will float high, 
causing the undefined instruction trap to 
be taken. 


UNEXECUTED INSTRUCTIONS 

Any instruction whose condition code is 
not met will fail to execute. It will add 
one cycle to the execution time of the
code segment in which it is embedded.


(For software interrupt PC is the
address of the SWI instruction, for
interrupts and reset PC is the address
of the instruction following the last one
to be executed before entering the
exception, for prefetch abort PC is the
address of the aborting instruction, for
data abort PC is the address of the
instruction following the one which
attempted the aborted data transfer. Xn
is the appropriate trap address).


TABLE 14. UNDEFINED INSTRUCTIONS AND
COPROCESSOR ABSENT

Cycle | Address | -B/W | -R/W | Data | SEQ | -MREQ | -OPC | -CPI | CPA | CPB
[table rows illegible in source]


INSTRUCTION SPEEDS
Due to the pipelined architecture of the CPU, instructions overlap
considerably. In a typical cycle one instruction may be using the data path
while the next is being decoded and the one after that is being fetched. For
this reason the following table presents the incremental number of cycles
required by an instruction, rather than the total number of cycles for which
the instruction uses part of the processor. Elapsed time (in cycles) for the
routine may be calculated from these figures.


If the condition is met the instruction
execution time is shown in Table 16
below.


TABLE 16. INSTRUCTION SPEEDS

Instruction Type                                  Instruction Timing Equation

Data Processing                                   1S
Data Processing With Register Controlled Shift    1S + 1S
Data Processing With PC Modified                  2S + 1N
Load Register                                     1S + 1N + 1I
Load Register With PC Loaded                      2S + 2N + 1I
Store Register                                    2N
Load Multiple                                     nS + 1N + 1I
Load Multiple With PC Loaded                      (n+1)S + 2N + 1I
Store Multiple                                    (n-1)S + 2N
Branch and Branch With Link                       2S + 1N
Software Interrupt, Trap                          2S + 1N
Multiply and Multiply With Accumulate             1S + mI
Coprocessor Data Operation                        1S + bI
Load or Store Coprocessor Data To Memory          1S + 2N + bI
Move From Coprocessor To VL86C010 Register        1S + bI + 1C
Move From VL86C010 To Coprocessor Register        1S + (b+1)I + 1C


n  is the number of words transferred.

m  is the number of cycles required by the multiply algorithm, which is
   determined by the contents of Rs. Multiplication by any number between
   2^(2m-3) and 2^(2m-1)-1 inclusive takes m cycles for m>1. Multiplication
   by 0 or 1 takes 1 cycle. The maximum value m can take is 16.

I  is an incremental cycle.

b  is the number of cycles spent in the Co-Processor busy-wait loop.
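
The timing equations in Table 16 lend themselves to a small calculator. The Python below is a sketch only: the function names are illustrative, values of Rs at or above 2^31 are assumed to be capped at the stated maximum of 16 cycles, and an I cycle is assumed to last as long as an S cycle, which the table itself does not state.

```python
def multiply_cycles(rs):
    """Return m, the number of multiply cycles for a 32-bit multiplier value Rs."""
    if not 0 <= rs <= 0xFFFFFFFF:
        raise ValueError("Rs must be an unsigned 32-bit value")
    if rs <= 1:
        return 1
    # 2^(2m-3) <= Rs <= 2^(2m-1) - 1 means Rs has 2m-2 or 2m-1 significant bits.
    m = (rs.bit_length() + 2) // 2
    return min(m, 16)  # assumption: larger values are capped at the maximum


def elapsed_ns(s_cycles, n_cycles, i_cycles, s_ns, n_ns):
    """Convert a cycle count to time; I cycles are assumed to take s_ns each."""
    return (s_cycles + i_cycles) * s_ns + n_cycles * n_ns


# Load Multiple of four words is nS + 1N + 1I with n = 4.
ldm_time = elapsed_ns(4, 1, 1, s_ns=100, n_ns=200)
```

With N cycles twice as long as S cycles, as is typical for page mode DRAM, the four-word load multiple above costs the equivalent of seven S cycles.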


If the condition is not met, all instructions take one S cycle.


TIMING CHARACTERISTICS: TA = 0°C to +70°C, VCC = 5 V ±5%

Symbol    Parameter                                    Conditions
tCK       Clock Period                                 (Max 10000 ns)
tCKL      Clock Period Low                             (Max 10000 ns)
tCKH      Clock Period High
tABE      Address Bus Enable
tABZ      Address Bus Disable
tALE      Address Latch Fall-Through
tALEL     ALE Low Time                                 See Note 1
tADDRS    CLK Rising Edge To Address Valid Delay
tADDRN    CLK Falling Edge To Address Valid Delay
tADRNA    NADR To Address Valid Delay
tADRMS    MSBLOW To Address Valid Delay
tAH       Address Bus Hold Time
tDBE      Data Bus Enable Time
tDBZ      Data Bus Disable Time
tDOUT     Data Bus Output Delay
tDOH      Data Bus Hold Time
tDIS      Data In Setup Time To CLK
tDIH      Data In Hold Time To CLK
tDISN     Data In Setup Time To NADR
tDIHN     Data In Hold Time To NADR
tABTS     ABORT Setup Time
tABTH     ABORT Hold Time
tIRS      Interrupt Setup Time                         See Note 2
tRWD      CLK To -R/W Valid
tRWH      -R/W Hold Time
tMSD      CLK To -MREQ And SEQ Delay
tMSH      -MREQ And SEQ Hold Time
tBWD      CLK To -B/W Valid
tBWH      -B/W Hold Time
tMDD      CLK To -M1, -M0 Valid
tMDH      -M1, -M0 Hold Time

[The remaining Min and Max values are illegible in the source]

Notes:
1. ALE controls a dynamic storage latch; this parameter is specified to ensure that the stored charge cannot leak sufficiently to
generate intermediate logic levels in the associated logic.
2. The interrupt and reset inputs may be asynchronous. This time will guarantee that the interrupt request is latched during this
cycle.


TIMING CHARACTERISTICS: TA = 0°C to +70°C, VCC = 5 V ±5%

Symbol    Parameter                                    Conditions
(most symbols, the first rows and the Min/Max values are illegible in the
source; the legible parameters are listed below)
          CLK To -TRAN Valid                           See Note 1
tCPIH     -CPI Hold Time
          CLK To PGHIT Delay
          PGHIT Hold Time
          CLK To Incremented Address Delay

Notes:
1. -TRAN will only change during CLK high as the result of a forced translation single data transfer operation while in the User
mode. Otherwise it will change during CLK low when the mode change to/from User mode occurs.


TIMING DIAGRAMS

PROCESSOR DATA BUS
[Timing diagram not reproduced. Labelled signals: ALE, ABE, DBE,
D31 - D0 (Read), NADR, MSBLOW, ABRT, -FIRQ, -IRQ. Labelled parameters:
tCK, tCKH, tALE, tAH, tADDRS, tABTS, tABTH, tIRS.]

PROCESSOR CONTROL SIGNALS
[Timing diagram not reproduced. Labelled signal: CLK. Labelled parameters:
tCK, tCKH, tCKL, tCPH, tCPIH.]


ABSOLUTE MAXIMUM RATINGS

Ambient Operating Temperature          -10°C to +80°C
Storage Temperature                    -65°C to +150°C
Supply Voltage to Ground Potential     -0.5 V to VCC + 0.3 V
Applied Output Voltage                 -0.5 V to VCC + 0.3 V
Applied Input Voltage                  -0.5 V to +7.0 V

Stresses above those listed may cause permanent damage to the device.
These are stress ratings only. Functional operation of this device at these
or any other conditions above those indicated in this data sheet is not
implied. Exposure to absolute maximum rating conditions for extended
periods may affect device reliability.


DC CHARACTERISTICS: TA = 0°C to +70°C, VCC = 5 V ±5%

Parameter                               Conditions
Output High Voltage, TTL (DATABUS)      IOH = -5.0 mA
Output Low Voltage, TTL (DATABUS)       IOL = 5.0 mA
Output High Voltage, CMOS               IOH = -2.5 mA
Output Low Voltage, CMOS                IOL = 2.5 mA
Input Leakage Current                   VIN = 0 V to VCC
Output Leakage Current                  VOUT = 0 V to VCC
Operating Supply Current (20 mA)        (Note 1)
Output Short Circuit Current (40 mA)

[Symbols, the input voltage rows and the remaining Min/Typ/Max values are
illegible in the source]


CAPACITANCE: TA = 25°C, f = 1.0 MHz

Parameter                               Conditions
Clock Input Capacitance                 VIN = 0 V (Note 2)
Other Input Capacitance                 VIN = 0 V (Note 2)
Output Capacitance                      VOUT = 0 V (Note 2)

[Symbols and Min/Max values are illegible in the source]

Notes:
1. Measured with outputs unloaded, at 10 MHz. Add 4 mA per MHz.
2. Periodically sampled, rather than 100% tested.


FIGURE 3. TEST WAVEFORMS
[Waveform drawing not reproduced; it shows the Input and Output test levels.]


FIGURE 4. TEST LOAD CIRCUIT
[Circuit drawing not reproduced; the Device Under Test output is loaded by
R1 to V1 LOAD and by C1.]

V1 LOAD = 2.4 V, DATABUS
V1 LOAD = 2.3 V, OTHERS
R1 = 160 Ω, DATABUS
R1 = 750 Ω, OTHER OUTPUTS
C1 = 100 pF, DATABUS
C1 = 50 pF, CPI, ADDR. BUS
C1 = 15 pF, OTHER OUTPUTS


PROGRAMMERS' MODEL
The APRM has a 32-bit data bus and a 
32-bit address bus although only 26-bits 
(64M-bytes) may be used for program 
space. The processor supports two 
data types, eight bit bytes and 32-bit 
words. Instructions are exactly one 
word, and data operations (e.g., ADD) 
are only performed on word quantities. 
Load and store operations can transfer 
either bytes or words. The APRM 
supports four modes of operation, 
including protected supervisor and 
interrupt handling modes. 


BYTE SIGNIFICANCE 

Some programming techniques may 
write a 32-bit (word) quantity to mem- 
ory, but will later retrieve the data as a 
sequence of byte (8-bit) items. For 
these purposes, the processor stores 
word data in most-significant-first (MSB 
first) order. This means that the most
significant byte of a 32-bit word
occupies the lowest byte address. The
byte address values are illustrated in 
Figure 5. 
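
The byte ordering described above can be demonstrated with Python's standard struct module, which offers a most-significant-first ("big-endian") packing mode. This is a sketch for illustration; the function name and the sample value are arbitrary.

```python
import struct


def word_to_bytes_msb_first(word):
    """Lay out a 32-bit word as it would appear at ascending byte addresses."""
    return struct.pack(">I", word)  # ">" selects most-significant-byte-first order


stored = word_to_bytes_msb_first(0x11223344)
# stored[0] is the byte at the lowest address and holds the most significant byte.
lowest_address_byte = stored[0]
```

Reading the word back as four byte loads at ascending addresses therefore yields 0x11, 0x22, 0x33, 0x44 in that order.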


REGISTERS 

The processor has 27 registers (32-bits 
each), 16 of which are visible to the 
programmer at any time. The visible 
subset depends on the current proces- 
sor mode; special registers are 
switched in to support interrupt and 
supervisor processing. The register 
bank organization is shown in Table 16. 


User mode is the normal program 
execution state; registers R15 - R0 are
directly accessible. 


All registers are general purpose and 
may be used to hold data or address 
values, except that register R15 
contains the Program Counter (PC) and 
the Processor Status Register (PSR). 
Special bits in some instructions allow 
the PC and PSR to be treated together 
or separately as required. Figure 6 
shows the allocation of bits within R15. 


R14 is used as the subroutine link 
register, and receives a copy of R15 
when a Branch and Link instruction is 
executed. It may be treated as a
general purpose register at all other
times. R14_svc, R14_irq and R14_firq
are used similarly to hold the return 
values of R15 when interrupts and 
exceptions arise, or when Branch and 
Link instructions are executed within 
supervisor or interrupt routines. 


FIRQ Processing - The FIRQ mode 
(described in the Exceptions section) 
has seven private registers mapped to 
R14 - R8 (R14_fiq-R8_fiq). Many FIRQ 
programs will not need to save any 
registers. 


IRQ Processing - The IRQ state has 
two private registers mapped to R14 
and R13 (R14_irq and R13_irq). 


FIGURE 5. BYTE SIGNIFICANCE OF APRM

                                                                          Word
                                                                          Address
                                                                          Value
Byte Addr. 0000 | Byte Addr. 0001 | Byte Addr. 0002 | Byte Addr. 0003 |   0000
Byte Addr. 0004 | Byte Addr. 0005 | Byte Addr. 0006 | Byte Addr. 0007 |   0001


Register     General Usage

R0           General
R1           General
R2           General
R3           General
R4           General
R5           General
R6           General
R7           General
R8           General
R9           General
R10          General
R11          General
R12 (FP)     Data Frame (by convention)
R13 (SP)     Stack Pointer (by convention)
R14 (LK)     R15 Save Area for BL or Interrupts
R15 (PC)     System Program Counter


Supervisor Mode - The SVC mode
(entered on SWI instructions and other
traps) has two private registers mapped
to R14 and R13 (R14_svc and
R13_svc).


The two private registers allow the IRQ 
and supervisor modes each to have a 
private stack pointer and link register. 
Supervisor and IRQ mode programs are 
expected to save the User state on their 
respective stacks and then use the User 
registers, remembering to restore the 
User state before returning. 


In User mode only the N, Z, C, and V
bits of the PSR may be changed. The I,
F, and Mode flags will change only
when an exception arises. In supervi-
sor and interrupt modes all flags may be
manipulated directly.


EXCEPTIONS 

Exceptions arise whenever there is a 
need for the normal flow of program 
execution to be broken, so that (for 
instance) the processor can be diverted 
to handle an interrupt from a peripheral. 
The processor state just prior to 
handling the exception must be 
preserved so that the original program 
can be resumed when the exception 
routine has completed. Many excep- 
tions may arise at the same time. 


The processor handles exceptions by 
using the banked registers to save 
state. The old PC and PSR are copied 
into the appropriate R14, and the PC 


and processor mode bits are forced to a 
value which depends on the exception. 
Interrupt disable flags are set where 
required to prevent unmanageable 
nestings of exceptions. In the case of a 
reentrant interrupt handler, R14 should 
be saved onto a stack in main memory 
before re-enabling the interrupt. When 
multiple exceptions arise simultane- 
ously a fixed priority determines the 
order in which they are handled. 
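
The entry sequence described above (save R15, force the mode and flags, force the PC) can be modelled in a few lines of Python. This is a sketch under stated assumptions: the vector addresses and the flags forced on entry are taken from the vector table and Table 18 as printed, the bit positions of the I and F flags (27 and 26) and of the mode field (bits 1:0) are assumed to follow the R15 layout of Figure 6, pipeline offsets in the saved PC are ignored, and all names are illustrative.

```python
I_BIT = 1 << 27          # IRQ disable flag (assumed bit position)
F_BIT = 1 << 26          # FIRQ disable flag (assumed bit position)
FLAG_MASK = 0xFC000000   # N, Z, C, V, I, F
MODE_BITS = {"user": 0b00, "firq": 0b01, "irq": 0b10, "svc": 0b11}

# exception kind: (vector address, mode entered, flags forced on)
EXCEPTIONS = {
    "reset":          (0x00, "svc",  I_BIT | F_BIT),
    "undefined":      (0x04, "svc",  I_BIT),
    "swi":            (0x08, "svc",  I_BIT),
    "prefetch_abort": (0x0C, "svc",  I_BIT),
    "data_abort":     (0x10, "svc",  I_BIT),
    "irq":            (0x18, "irq",  I_BIT),
    "firq":           (0x1C, "firq", I_BIT),
}


def enter_exception(r15, banked_r14, kind):
    """Save R15 in the new mode's R14 and return the new R15 value."""
    vector, mode, forced = EXCEPTIONS[kind]
    banked_r14[mode] = r15                 # old PC and PSR are saved together
    flags = (r15 & FLAG_MASK) | forced     # condition flags kept, disables set
    return flags | vector | MODE_BITS[mode]


r14 = {}
new_r15 = enter_exception(0x00008000, r14, "irq")
```

Because the PC and PSR share R15, a single register copy preserves the whole interrupted state, which is what makes the one-instruction returns in Table 18 possible.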


FIRQ - The FIRQ (Fast Interrupt 
Request) exception is externally 
generated by taking the -FIRQ pin low. 
This input can accept asynchronous 
transitions, and is delayed by one clock 
cycle for synchronization before it can
affect the processor execution flow. It is 
designed to support a data transfer or 
channel process, and has sufficient 
private registers to remove the need for 
register saving in such applications, so 
that the overhead of context switching is 
minimized. The FIRQ exception may 
be disabled by setting the F flag in the 
PSR (but note that this is not possible 
from user mode). If the F flag is clear 
the processor checks for a low level on 
the output of the FIRQ synchronizer at 
the end of each instruction. 


The impact upon execution of an FIRQ 
interrupt is defined in Table 18. The 
return-from-interrupt sequence is also 
defined there. This will resume execu- 
tion of the interrupted code sequence, 
and restore the original processor state. 


FIGURE 6. PROGRAM COUNTER AND PROCESSOR STATUS REGISTER

Bit(s)     Field

31         N   Negative/Signed Less Than
30         Z   Zero
29         C   Carry/Not Borrow/Rotate Extend
28         V   Overflow
27         I   IRQ Disable    (0 = Enable, 1 = Disable)
26         F   FIRQ Disable   (0 = Enable, 1 = Disable)
25 - 2     Program Counter (Word Aligned)
1 - 0      M1, M0   Processor Mode
                    00 = User Mode
                    01 = FIRQ Mode
                    10 = IRQ Mode
                    11 = Supervisor Mode
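
Since R15 combines the program counter with the status flags, software that inspects a saved R15 value has to separate the fields. The Python below is a sketch of that unpacking. It assumes the bit assignment N, Z, C, V, I, F in bits 31 down to 26, the word-aligned program counter in bits 25 to 2 and the mode in bits 1 and 0; the function name is illustrative.

```python
MODES = ("User", "FIRQ", "IRQ", "Supervisor")  # indexed by the two mode bits


def decode_r15(r15):
    """Split a 32-bit R15 value into its flags, program counter and mode."""
    return {
        "N": bool(r15 & (1 << 31)),   # Negative/Signed Less Than
        "Z": bool(r15 & (1 << 30)),   # Zero
        "C": bool(r15 & (1 << 29)),   # Carry/Not Borrow/Rotate Extend
        "V": bool(r15 & (1 << 28)),   # Overflow
        "I": bool(r15 & (1 << 27)),   # IRQ Disable
        "F": bool(r15 & (1 << 26)),   # FIRQ Disable
        "pc": r15 & 0x03FFFFFC,       # word-aligned 26-bit program address
        "mode": MODES[r15 & 0b11],
    }


fields = decode_r15(0x8800001A)
```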


IRQ - The IRQ (Interrupt Request) 
exception is a normal interrupt caused 
by a low level on the —IRQ pin. It has a 
lower priority than FIRQ, and is masked 
out when a FIRQ sequence is entered. 
Its effect may be masked out at any 
time by setting the I bit in the PC (but
note that this is not possible from user
mode). If the I flag is clear, the proces-
sor checks for a low level on the output
of the IRQ synchronizer at the end of 
each instruction. 


The impact upon execution of an IRQ 
interrupt is defined in Table 18. The 
return-from-interrupt sequence is also 
defined there. This will resume execu- 
tion of the interrupted code sequence, 
restore the original processor state, and 
reenable the IRQ interrupt. 


Abort - The ABORT signal comes from 
an external memory management 
system, and indicates that the current 
memory access cannot be completed. 
For instance, in a virtual memory 
system the data corresponding to the 
current address may have been moved 
out of memory onto a disk, and consid- 
erable processor activity may be 
required to recover the data before the 
access can be performed successfully. 
The processor checks for an abort at 
the end of the first phase of each bus 
cycle. When successfully aborted, the 
APRM will respond in one of three 
ways: 


(i) If the abort occurred during an
instruction prefetch (a prefetch
abort), the prefetched instruction is
marked as invalid; when it comes to
execution, it is reinterpreted as
below. (If the instruction is not
executed, for example as a result of
a branch being taken while it is in
the pipeline, the abort will have no
effect.)


(ii) If the abort occurred during a data
access (a data abort), the action
depends on the instruction type.
Data transfer instructions (LDR,
STR) are aborted as though they
had not executed. The LDM and
STM instructions complete, and if
writeback is set, the base is
updated. If the instruction would
normally have overwritten the base
with data (i.e. LDM with the base in
the transfer list), this overwriting is
prevented. All register overwriting is
prevented after the abort is
indicated, which means in particular that
R15 (which is always last to be
transferred) is preserved in an
aborted LDM instruction.


(iii) If the abort occurred during an
internal cycle it is ignored.


Then, in cases (i) and (ii), the processor 
will respond as defined in Table 18. 


The return from Prefetch Abort defined 
in the Figure will attempt to execute the 
aborting instruction (which will only be 
effective if action has been taken to 
remove the cause of the original abort). 
A Data Abort requires any auto- 
indexing to be reversed before returning 
to re-execute the offending instruction. 
The return is performed as defined in 
the Figure. 


The abort mechanism allows a demand 
paged virtual memory system to be 
implemented when a suitable memory 
management unit is available in the 
system. The processor is allowed to 
generate arbitrary addresses, and when 
the data at an address is unavailable, 
the memory manager signals an abort. 
The processor traps into system 
software which must work out the cause 
of the abort, make the requested data 
available, and retry the aborted instruc- 
tion. The application program needs no 
knowledge of the amount of memory 
available to it, nor is its state in any way 
affected by the abort. 


Software interrupt - The software 
interrupt is used for getting into supervi- 
sor mode, usually to request a particular 
supervisor function. The processor 
response to the (SWI) instruction is 
defined in Table 18, as is the method of 
returning. The indicated return method 
will return to the instruction following the 
SWI. 


Undefined instruction Trap - When
the APRM executes a coprocessor instruction
or an undefined instruction, it
offers it to any coprocessors which may
be present. If a coprocessor can
perform this instruction but is busy at
that moment, the processor will wait
until the coprocessor is ready. If no
coprocessor can handle the instruction
the APRM will take the undefined instruction
trap.

The trap may be used for software
emulation of a coprocessor in a system
which does not have the coprocessor
hardware, or for general purpose
instruction set extension by software
emulation.

When the undefined instruction trap is
taken the APRM will respond as
defined in Table 18. The return from
this trap (after performing a suitable
emulation of the required function),
defined in Table 18, will return to the
instruction following the undefined
instruction.


Reset - When RES goes high the
processor will stop the currently
executing instruction and start executing
no-ops. When RES goes low
again it will respond as defined in
Table 18. There is no meaningful
return from this condition.


The conventional means of implementing
an interrupt dispatch function is to
provide a table of jumps to the appropriate
processing routine, as below:

Address    Function
0000000    Reset
0000004    Undefined instruction
0000008    Software interrupt
000000C    Abort (prefetch)
0000010    Abort (data)
0000014    Unused
0000018    IRQ
000001C    FIRQ


These are byte addresses, and each
contains a branch instruction pointing to
the relevant routine. The FIRQ routine
might reside at 000001CH onwards,
and thereby avoid the need for (and
execution time of) a branch instruction.


TABLE 18. EXCEPTION TRAP CONSIDERATIONS

Trap Type      CPU Trap Activity                   Program Return Sequence

Reset          1. Save R15 in R14 (SVC).           (n/a)
               2. Force M1:0 to SVC mode, and
                  set F & I status bits in PC.
               3. Force PC to 0x000000.

Undefined      1. Save R15 in R14 (SVC).           MOVS PC, R14    ; SVC's R14.
Instruction    2. Force M1:0 to SVC mode,
                  and set I status bit in the PC.
               3. Force PC to 0x000004.

Software       1. Save R15 in R14 (SVC).           MOVS PC, R14    ; SVC's R14.
Interrupt      2. Force M1:0 to SVC mode, and
                  set I status bit in the PC.
               3. Force PC to 0x000008.

Prefetch       1. Save R15 in R14 (SVC).           Prefetch Abort:
and Data       2. Force M1:0 to SVC mode, and      SUBS PC, R14,4  ; SVC's R14.
Aborts            set I status bit in the PC.
               3. Force PC to 0x000010             Data Abort:
                  (data abort; the prefetch        SUBS PC, R14,8  ; SVC's R14.
                  abort vector is 0x00000C).

IRQ            1. Save R15 in R14 (IRQ).           SUBS PC, R14,4  ; IRQ's R14.
               2. Force M1:0 to IRQ mode, and
                  set I status bit in the PC.
               3. Force PC to 0x000018.

FIRQ           1. Save R15 in R14 (FIRQ).          SUBS PC, R14,4  ; FIRQ's R14.
               2. Force M1:0 to FIRQ mode,
                  and set I status bit in the PC.
               3. Force PC to 0x00001C.


Exception Priorities - When multiple


exceptions arise at the same time, a 
fixed priority system determines the 
order in which they will be handled: 


1) Reset (highest priority) 

2) Data aborts 

3) FIRQ 

4) IRQ 

5) Prefetch abort 

6) Undefined Instruction and
SWIs (lowest priority)


Note that not all exceptions can occur at 
once. Undefined instruction and 
software interrupt are also mutually 
exclusive since they each correspond to 
particular (non-overlapping) decodings 
of the current instruction. 


If a data abort occurs at the same time
as a FIRQ, and FIRQs are enabled (i.e., 
the F flag in the PSR is clear), the 
processor will enter the data abort 
handler and then immediately proceed 


to the FIRQ vector. A normal return 
from FIRQ will cause the data abort 
handler to resume execution. Placing 
data abort at a higher priority than FIRQ 
is necessary to ensure that the transfer 
error does not escape detection, but the 
time for this exception entry should be 
reflected in worst case FIRQ latency 
calculations. 


Interrupt Latencies - The worst case
latency for FIRQ, assuming that it is
enabled, consists of the longest time
the request can take to pass through
the synchronizer (Tsyncmax), plus the
time for the longest instruction to
complete (Tldm, the longest instruction
is load multiple registers), plus the time
for data abort entry (Texc), plus the time
for FIRQ entry (Tfiq). At the end of this
time the processor will be executing the
instruction at 1CH.

Tsyncmax is 2.5 processor cycles,
Tldm is 18 cycles, Texc is three cycles,
and Tfiq is two cycles. The total time is,
therefore, 25.5 processor cycles, which
is just over 2.5 microseconds in a
system using a continuous 10 MHz
processor clock. In a DRAM based
system running at 4 and 8 MHz, for
example using the VL86C110 MMU,
this time becomes 4.5 microseconds,
and if bus bandwidth is being used to
support video or other DMA activity, the
time will increase accordingly.
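
The worst-case FIRQ latency arithmetic above can be checked with a short sketch. The function and parameter names are illustrative only and are not part of any APRM tool; the default cycle counts are the ones quoted in the text.

```python
def firq_worst_case_cycles(t_syncmax=2.5, t_ldm=18.0, t_exc=3.0, t_fiq=2.0):
    """Sum the four latency components, all in processor cycles."""
    return t_syncmax + t_ldm + t_exc + t_fiq


def cycles_to_microseconds(cycles, clock_mhz):
    """Convert a cycle count to microseconds for a continuous clock in MHz."""
    return cycles / clock_mhz


if __name__ == "__main__":
    cycles = firq_worst_case_cycles()
    print(cycles, "cycles =", cycles_to_microseconds(cycles, 10.0), "us at 10 MHz")
```

The conversion assumes a continuous clock; it does not model the stretched cycles of a DRAM based system, which is why the 4.5 microsecond figure is not reproduced here.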


The maximum IRQ latency calculation
is similar, but must allow for the fact that
FIRQ has higher priority and could
delay entry into the IRQ handling
routine for an arbitrary length of time.


The minimum latency for FIRQ or IRQ
consists of the shortest time the request
can take through the synchronizer
(Tsyncmin) plus Tfiq. This is 3.5
processor cycles.


INSTRUCTION SET


All APRM instructions are conditionally
executed, which means that their execution
may or may not take place depending
on the values of the N, Z, C,
and V flags in the PSR at the end of
the preceding instruction.


If the ALways condition is specified,
the instruction will be executed
irrespective of the flags, and likewise
the Never condition will cause it not to
be executed (it will be a no-op, taking
one cycle and having no effect on the
processor state).


The other condition codes have
meanings as detailed in Figure 7. For
instance, code 0000 (EQual) causes
the instruction to be executed only if
the Z flag is set. This would correspond
to the case where a compare
(CMP) instruction had found the two
operands to be equal. If the two
operands were different, the compare
instruction would have cleared the Z
flag, and the instruction will not be
executed.


FIGURE 7. CONDITION FIELD

 31   28 27                                      0
| Cond  |            (any instruction)            |

Condition Field

0000 = EQ - Z set (equal)
0001 = NE - Z clear (not equal)
0010 = CS - C set (unsigned higher or same)
0011 = CC - C clear (unsigned lower)
0100 = MI - N set (negative)
0101 = PL - N clear (positive or zero)
0110 = VS - V set (overflow)
0111 = VC - V clear (no overflow)
1000 = HI - C set and Z clear (unsigned higher)
1001 = LS - C clear or Z set (unsigned lower or same)
1010 = GE - N set and V set, or N clear and V clear (greater or equal)
1011 = LT - N set and V clear, or N clear and V set (less than)
1100 = GT - Z clear, and either N set and V set, or N clear and V clear (greater than)
1101 = LE - Z set, or N set and V clear, or N clear and V set (less than or equal)
1110 = AL - Always
1111 = NV - Never
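
The condition field can be modelled as a lookup from the four-bit code to a test on the N, Z, C and V flags. The sketch below is an illustration, not a description of the hardware; the function name is hypothetical, and it assumes the usual reading in which code 0101 (PL) tests N clear and code 0110 (VS) tests V set.

```python
def condition_passes(cond, n, z, c, v):
    """Return True if an instruction with 4-bit condition `cond` would execute.

    n, z, c, v are the PSR flags as left by the preceding instruction.
    """
    n, z, c, v = bool(n), bool(z), bool(c), bool(v)
    checks = [
        z,                     # 0000 EQ
        not z,                 # 0001 NE
        c,                     # 0010 CS
        not c,                 # 0011 CC
        n,                     # 0100 MI
        not n,                 # 0101 PL (assumed: N clear)
        v,                     # 0110 VS (assumed: V set)
        not v,                 # 0111 VC
        c and not z,           # 1000 HI
        (not c) or z,          # 1001 LS
        n == v,                # 1010 GE
        n != v,                # 1011 LT
        (not z) and (n == v),  # 1100 GT
        z or (n != v),         # 1101 LE
        True,                  # 1110 AL
        False,                 # 1111 NV
    ]
    return checks[cond & 0xF]
```

For example, after a compare of two equal operands (Z set) `condition_passes(0b0000, 0, 1, 1, 0)` is true, while the NV code never passes whatever the flags.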


The B and BL instructions are only exe- 
cuted if the condition code field is true. 


All branches support a 24 bit offset. The 
offset is shifted left two bits and added 
to the PC, with overflows being ignored. 
The branch can therefore reach any 
word aligned address within the 
program address space. The branch 
offset must take account of the prefetch 
operation, which causes the PC to be 
two words ahead of the current instruc- 
tion. 
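
The offset calculation can be illustrated with a small encoder. This is a sketch under stated assumptions: the function names are hypothetical, and the field positions (condition in bits 31 to 28, the pattern 101 in bits 27 to 25, the link bit in bit 24, the offset in bits 23 to 0) are the conventional ARM layout, assumed here rather than taken from this page.

```python
def encode_branch(instr_addr, target_addr, link=False, cond=0b1110):
    """Encode B/BL at byte address instr_addr branching to target_addr."""
    # The PC is two words (eight bytes) ahead of the branch when the offset is added.
    offset_words = (target_addr - (instr_addr + 8)) >> 2
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (offset_words & 0xFFFFFF)


def branch_target(instr_addr, word):
    """Recover the destination byte address from an encoded branch."""
    offset = word & 0xFFFFFF
    if offset & 0x800000:          # sign-extend the 24-bit offset
        offset -= 1 << 24
    # Overflow is ignored: wrap within the 26-bit program address space.
    return (instr_addr + 8 + (offset << 2)) & 0x3FFFFFF
```

A branch to itself needs an offset of minus two words, so `encode_branch(0x100, 0x100)` gives `0xEAFFFFFE`.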


Link bit - Branch with Link writes the
old PC and PSR into R14 of the current
bank. The PC value written into the link
register (R14) is adjusted to allow for
the prefetch, and contains the address
of the instruction following the branch
and link instruction.


FIGURE 8. BRANCH, AND BRANCH WITH LINK (B, BL)

 31   28 27 25 24 23                          0
| Cond  | 101 | L |     PC Relative Offset     |

Cond = Condition Field
L    = Link Bit
       0 = Branch
       1 = Branch with Link (Subroutine call)


Return from Subroutine - When
returning to the caller, there is an option
to restore or to not restore the PSR.
The following table illustrates the
available combinations.

Restoring PSR:        MOVS PC,R14     LDM Rn!, {PC}^
Not Restoring PSR:    MOV PC,R14      LDM Rn!, {PC}

Syntax:
B{L}{cond} <expression>
where L is used to request the Branch-with-Link form of the instruction.
If absent, R14 will not be affected by the instruction.
cond is a two-character mnemonic as shown in Condition Code section (EQ, NE,
VS, etc.). If absent then AL (Always) will be used.
expression is the destination. The assembler calculates the relative (word) offset.


Items in { } are optional. Items in < > must be present.

Examples:
Here  BAL   Here        ; Assembles to EAFFFFFE. (Note effect of PC offset)
      B     There       ; Always condition used as default
      CMP   R1,0        ; Compare register one with zero, and branch to Fred if
      BEQ   Fred        ; register one was zero. Else continue next instruction.
      BL    ROM + Sub   ; Unconditionally call subroutine at computed address.
      ADDS  R1, 1       ; Add one to register one, setting PSR flags on the result.
      BLCC  Sub         ; Call Sub if the C flag is clear, which will be the case unless
                        ; R1 contained FFFFFFFFH. Else continue next instruction.
      BLNV  Sub         ; Never call subroutine (this is a NO-OP).


ALU INSTRUCTION


The ALU-type instruction is only
executed if the condition is true. The
various conditions are defined in the
Condition Code section.


The instruction produces a result by
performing a specified arithmetic or
logical operation on one or two operands.
The first operand is always a
register (Rn). The second operand may
be a shifted register (Rm) or a rotated
eight-bit immediate value (Imm)
according to the value of the I bit in the
instruction. The condition codes in the
PSR may be preserved or updated as a
result of this instruction, according to
the value of the S bit in the instruction.
Certain operations (TST, TEQ, CMP,
CMN) do not write the result to Rd.
They are used only to perform tests and
to set the condition codes on the result,
and therefore should always have the S
bit set. (The assembler treats TST,
TEQ, CMP and CMN as TSTS, TEQS,
CMPS and CMNS by default).


FIGURE 9. ALU INSTRUCTION TYPES
 31   28 27 26 25 24    21 20 19  16 15  12 11             0
| Cond  | 0 0 | I | Opcode | S |  Rn  |  Rd  |   Operand 2   |

Cond   = Condition code
I      = Immediate bit
         0 = Operand 2 is a register.
         1 = Operand 2 is an immediate value.
S      = Set condition codes
         0 = do not alter condition codes
         1 = set condition codes
Rn     = 1st operand register
Rd     = Destination Register

Operation Code
0000 = AND - Rd = Op1 AND Op2
0001 = EOR - Rd = Op1 EOR Op2
0010 = SUB - Rd = Op1 - Op2
0011 = RSB - Rd = Op2 - Op1
0100 = ADD - Rd = Op1 + Op2
0101 = ADC - Rd = Op1 + Op2 + C
0110 = SBC - Rd = Op1 - Op2 + C
0111 = RSC - Rd = Op2 - Op1 + C
1000 = TST - set condition codes on Op1 AND Op2
1001 = TEQ - set condition codes on Op1 EOR Op2
1010 = CMP - set condition codes on Op1 - Op2
1011 = CMN - set condition codes on Op1 + Op2
1100 = ORR - Rd = Op1 OR Op2
1101 = MOV - Rd = Op2
1110 = BIC - Rd = Op1 AND NOT Op2
1111 = MVN - Rd = NOT Op2

Imm=1 --> Operand 2 is an immediate value.

 11     8 7              0
| Rotate |   Immediate    |

Rotate    = Right rotate amount to be applied to the 8-bit immediate
            (2-bit shift units).
Immediate = Unsigned 8-bit immediate value

Imm=0 --> Operand 2 is in a register.

 11             4 3     0
|     Shift      |  Rm   |

Rm    = 2nd-Operand register
Shift = Shift applied to Rm (as shown in the expansions below).

 11          7 6   5  4            11    8  7  6   5  4
| Shift Amount | Type |0|          |   Rs   |0| Type |1|

Shift Amount = 5-bit unsigned integer.
Rs           = Shift register; the shift amount is in the bottom byte of Rs.
Shift Type     00 = Logical Left (LSL)
               01 = Logical Right (LSR)
               10 = Arithmetic Right (ASR)
               11 = Rotate Right (ROR)


OPERATIONS

Assembler
Mnemonic  Opcode  Action

AND       0000    Bit-wise logical AND of operands
EOR       0001    Bit-wise logical Exclusive Or of operands
SUB       0010    Subtract operand 2 from operand 1
RSB       0011    Subtract operand 1 from operand 2
ADD       0100    Add operands
ADC       0101    Add operands plus carry (PSR C flag)
SBC       0110    Subtract operand 2 from operand 1 plus carry
RSC       0111    Subtract operand 1 from operand 2 plus carry
TST       1000    as AND, but result is not written
TEQ       1001    as EOR, but result is not written
CMP       1010    as SUB, but result is not written
CMN       1011    as ADD, but result is not written
ORR       1100    Bit-wise logical OR of operands
MOV       1101    Move operand 2 (operand 1 is ignored)
BIC       1110    Bit clear (bit-wise AND of operand 1 and NOT operand 2)
MVN       1111    Move NOT operand 2 (operand 1 is ignored)

PSR Flags - The operations may be
classified as logical or arithmetic. The
logical operations (AND, EOR, TST,
TEQ, ORR, MOV, BIC, MVN) perform
the logical action on all corresponding
bits of the operand or operands to
produce the result. If the S bit is set
(and Rd is not R15) the V flag in the
PSR will be unaffected, the C flag will
be set to the carry out from the barrel
shifter (or preserved when the shift
operation is LSL 0), the Z flag will be set
if and only if the result is all zeroes, and
the N flag will be set to the logical value
of bit 31 of the result.


The arithmetic operations (SUB, RSB,
ADD, ADC, SBC, RSC, CMP, CMN)
treat each operand as a 32-bit integer
(either unsigned or 2's complement
signed, the two are equivalent). If the S
bit is set (and Rd is not R15) the V flag
in the PSR will be set if an overflow
occurs into bit 31 of the result; this may
be ignored if the operands were
considered unsigned, but warns of a
possible error if the operands were 2's
complement signed. The C flag will be
set to the carry out of bit 31 of the ALU,
the Z flag will be set if and only if the
result was zero, and the N flag will be
set to the value of bit 31 of the result
(indicating a negative result if the
operands are considered to be 2's
complement signed).


Shifts - When the second operand is
specified to be a shifted register, the
operation of the barrel shifter is controlled
by the shift field in the instruction.
This field indicates the type of shift to be
performed (logical left or right, arithmetic
right or rotate right). The amount by
which the register should be shifted
may be contained in an immediate field
in the instruction, or in the bottom byte
of another register as shown in Figure
9.


When the shift amount is specified in
the instruction, it is contained in a
five-bit field which may take any value from
zero to 31. A logical shift left (LSL)
takes the contents of Rm and moves
each bit by the specified amount to a
more significant position. The least
significant bits of the result are filled
with zeroes, and the high bits of Rm
which do not map into the result are
discarded, except that the least
significant discarded bit becomes the
shifter carry output which may be
latched into the C bit of the PSR when
the ALU operation is in the logical class
(see above). For example, the effect of
LSL 5 is:


FIGURE 10. LOGICAL SHIFT LEFT (LSL)

[Diagram: contents of Rm above the shifted result in Operand 2. For LSL 5,
bits 26 to 0 of Rm appear in bits 31 to 5 of Operand 2, bits 4 to 0 are
zero, and bit 27 of Rm goes to the carry flag.]


Note that LSL 0 is a special case,
where the shifter carry out is the old
value of the PSR C flag. The contents
of Rm are used directly as the second
operand.


A Logical Shift Right (LSR) is similar,
but the contents of Rm are moved to
less significant positions in the result.
LSR 5 has the following effect:

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FIGURE 11. LOGICAL SHIFT RIGHT (LSR) 


[Diagram: contents of Rm above the shifted result in Operand 2. For LSR 5,
the upper 27 bits of Rm appear in bits 26 to 0 of Operand 2, bits 31 to 27
are zero, and bit 4 of Rm goes to the carry flag.]


The form of the shift field which might
be expected to correspond to LSR 0 is
used to encode LSR 32, which has the
zero result, with bit 31 of Rm as the
carry output. Logical shift right zero is
redundant, as it is the same as logical
shift left zero. Therefore, the assembler
converts LSR 0, and ASR 0, and
ROR 0 into LSL 0, and allows LSR 32 to
be specified.


The Arithmetic Shift Right (ASR) is
similar to logical shift right, except
that the high bits are filled with replicates
of the sign bit (bit 31) of the Rm
register, instead of zeros. This signed
shift preserves the sign of the value, so
that a (signed) negative integer can be
divided by powers of two via a right
shift. For example, ASR 5 has the
following effect:


FIGURE 12. ARITHMETIC SHIFT RIGHT (ASR) 


[Diagram: contents of Rm above the shifted result in Operand 2. For ASR 5,
the upper 27 bits of Rm appear in bits 26 to 0 of Operand 2, bits 31 to 27
are copies of the sign bit (sign extended), and bit 4 of Rm goes to the
carry flag.]


The form of the shift field which might
be expected to give ASR 0 is used to
encode ASR 32. Bit 31 of Rm is again
used as the carry output, and each bit
of operand 2 is also equal to the sign
bit (bit 31) of Rm. The result is therefore
all ones or all zeros, according to the
value of bit 31 of Rm.


Rotate Right (ROR) operations reuse
the bits which "overshoot" in a logical
shift right operation, by wrapping them
around at the high end of the result.
For example, the effect of a ROR 5 is:


FIGURE 13. ROTATE RIGHT (ROR) 


[Diagram: contents of Rm above the shifted result in Operand 2. For ROR 5,
the lower 5 bits of Rm appear in bits 31 to 27 of Operand 2, the upper 27
bits of Rm appear in bits 26 to 0, and bit 4 of Rm goes to the carry flag.]


The form of the shift field which might
be expected to give ROR 0 is used to
encode a special function of the barrel
shifter, Rotate Right Extended (RRX).
This is a rotate right by one bit position
of the 33 bit quantity formed by appending
the PSR C flag to the most significant
end of the contents of Rm:


FIGURE 14. ROTATE RIGHT EXTENDED (RRX) 


[Diagram: contents of Rm above the shifted result in Operand 2. For RRX,
the previous value of the carry flag (from before the shift) enters bit 31
of Operand 2, bits 31 to 1 of Rm appear in bits 30 to 0, and bit 0 of the
Rm register goes to the carry flag.]
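
The instruction-specified shifts described above, including the special encodings for a shift field of zero, can be summarised in a small behavioural model. This is a sketch only: the function name and the string shift names are illustrative, and the model covers the five-bit immediate shift amount, not register-specified amounts.

```python
MASK32 = 0xFFFFFFFF


def shift_immediate(rm, shift_type, amount, carry_in):
    """Model an instruction-specified shift of Rm.

    amount is the raw 5-bit field (0..31); carry_in is the PSR C flag (0 or 1).
    Returns (operand2, shifter_carry_out).
    """
    rm &= MASK32
    if shift_type == "LSL":
        if amount == 0:                       # LSL 0: Rm unchanged, old C passed on
            return rm, carry_in
        return (rm << amount) & MASK32, (rm >> (32 - amount)) & 1
    if shift_type == "LSR":
        n = amount or 32                      # field value 0 encodes LSR 32
        if n == 32:
            return 0, (rm >> 31) & 1
        return rm >> n, (rm >> (n - 1)) & 1
    if shift_type == "ASR":
        n = amount or 32                      # field value 0 encodes ASR 32
        sign = (rm >> 31) & 1
        if n == 32:
            return (MASK32 if sign else 0), sign
        fill = (MASK32 << (32 - n)) & MASK32 if sign else 0
        return (rm >> n) | fill, (rm >> (n - 1)) & 1
    if shift_type == "ROR":
        if amount == 0:                       # field value 0 encodes RRX
            return ((carry_in << 31) | (rm >> 1)) & MASK32, rm & 1
        rotated = ((rm >> amount) | (rm << (32 - amount))) & MASK32
        return rotated, (rm >> (amount - 1)) & 1
    raise ValueError("unknown shift type: " + shift_type)
```

For instance, `shift_immediate(0x80000000, "ASR", 5, 0)` fills the top five bits with the sign bit and returns `(0xFC000000, 0)`.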


Register-Based Shift Counts - Only
the least significant byte of the contents
of Rs is used to determine the shift
amount. If this byte is zero, the
unchanged contents of Rm will be used
as the second operand, and the old
value of the PSR C flag will be passed
on as the shifter carry output.


If the byte has a value between one and
31, the shifted result will exactly match
that of an instruction specified shift with
the same value and shift operation.


Shifts of 32 or More - The result will be
a logical extension of the shifting
processes described above:

Shift                  Action

LSL by 32              Result zero, carry out equal to bit zero of Rm.
LSL by more than 32    Result zero, carry out zero.
LSR by 32              Result zero, carry out equal to bit 31 of Rm.
LSR by more than 32    Result zero, carry out zero.
ASR by 32 or more      Result filled with, and carry out equal to, bit 31 of Rm.
ROR by 32              Result equal to Rm, carry out equal to bit 31 of Rm.
ROR by more than 32    Same result and carry out as ROR by n-32. Therefore, repeatedly
                       subtract 32 from count until within the range one to 32.


Note: The zero in bit seven of an instruction with a register controlled shift is compulsory; a one in this bit will cause the
instruction to be a multiply or an undefined instruction.


Immediate Operand Rotation - The
immediate operand rotate field is a four-bit
unsigned integer which specifies a
shift operation on the eight bit immediate
value. The immediate value is zero
extended to 32-bits, and then subject to
a rotate right by twice the value in the
rotate field. This enables many
common constants to be generated,
for example all powers of two. Another
example is that the eight bit constant
may be aligned with the PSR flags (bits
zero, one, and 26 to 31). All the flags
can thereby be initialized in one TEQP
instruction.


Writing to R15 - When Rd is a register
other than R15, the condition code flags
in the PSR may be updated from the
ALU flags as described above. When
Rd is R15 and the S flag in the instruction
is set, the PSR is overwritten by the
corresponding bits in the ALU result, so
bit 31 of the result goes to the N flag, bit
30 to the Z flag, bit 29 to the C flag and
bit 28 to the V flag. In user mode the
other flags (I, F, M1, M0) are protected
from direct change, but in non-user
modes these will also be affected,
accepting copies of bits 27, 26, one and
zero of the result respectively.
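
The bit assignments just described can be sketched as a pair of masks. The code below is an illustrative model, not the hardware definition: the names are hypothetical, the caller states whether the processor is in user mode, and it assumes the 24 PC bits occupy bits 25 to 2 of R15 (the bits left over once the flags in bits 31 to 26 and the mode bits in bits 1 and 0 are accounted for).

```python
MASK32 = 0xFFFFFFFF
FLAG_MASK = 0xF0000000      # N, Z, C, V in bits 31..28
CONTROL_MASK = 0x0C000003   # I, F in bits 27..26; M1, M0 in bits 1..0
PC_MASK = 0x03FFFFFC        # assumed position of the 24 PC bits


def write_r15_with_s(old_r15, alu_result, user_mode):
    """Model a result-producing ALU instruction with Rd = R15 and S set."""
    psr_mask = FLAG_MASK if user_mode else (FLAG_MASK | CONTROL_MASK)
    writable = PC_MASK | psr_mask
    # Protected bits keep their old value; writable bits copy the ALU result.
    return (old_r15 & ~writable & MASK32) | (alu_result & writable)
```

In user mode an all-ones result changes only the four condition flags and the PC bits, so `write_r15_with_s(0, 0xFFFFFFFF, user_mode=True)` yields `0xF3FFFFFC`.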


When one of these instructions is used
to change the processor mode (which is
only possible in a non-user mode), the
following instruction should not access
a banked register (R14-R8) during its
first cycle. A no-op should be inserted if
the next instruction must access a
banked register. Accesses to the
unbanked registers (R7-R0 and R15)
are safe.


If the S flag is clear when Rd is R15,
only the 24 PC bits of R15 will be
written. Conversely, if the instruction is
of a type which does not normally
produce a result (CMP, CMN, TST,
TEQ) but Rd is R15 and the S bit is set,
the result will be used in this case to
update those PSR flags which are not
protected by virtue of the processor
mode.


R15 as an Operand - If R15 is used as
an operand in a data processing
instruction it can present different
values depending on which operand
position it occupies. It will always
contain the value of the PC. It may or
may not contain the values of the PSR
flags as they were at the completion of
the previous instruction.


When R15 appears in the Rm position it
will give the value of the PC together
with the PSR flags to the barrel shifter.


When R15 appears in either of the Rn
or Rs positions it will give the value of
the PC alone, with the PSR bits
replaced by zeroes.


The PC value will be the address of the
instruction, plus eight or 12 bytes due to
instruction prefetching. If the shift
amount is specified in the instruction,
the PC will be eight bytes ahead. If a
register is used to specify the shift
amount, the PC will be eight bytes
ahead when used as Rs, and 12 bytes
ahead when used as Rn or Rm.



Syntax:

MOV, MVN - single operand instructions:
<opcode>{cond}{S} Rd,<Op2>

CMP, CMN, TEQ, TST - instructions not producing a result:
<opcode>{cond}{P} Rn,<Op2>

AND, EOR, SUB, RSB, ADD, ADC, SBC, RSC, ORR, BIC:
<opcode>{cond}{S} Rd, Rn, <Op2>


where Op2        Is Rm{,<shift>} or <expression>
cond             Two-character condition mnemonic, see Condition Code section.
S                Set condition codes if S present (implied for CMP, CMN, TEQ, TST).
P                Make Rd = R15 in instructions where Rd is not specified, otherwise Rd will
                 default to R0. (Used for changing the PSR directly from the ALU result.)
Rd, Rn and Rm    Are any valid register name, such as R0-R15, PC, SP, or LK.
<shift>          Is <shiftname> <register> or <shiftname> expression, or RRX (rotate right
                 one bit with extend).
<shiftname>s     Are any of: ASL, LSL, LSR, ASR, or ROR.


Note: If <expression> is used, the assembler will attempt to generate a shifted immediate eight-bit field to match the expression.
If this is impossible, it will give an error.
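
One way an assembler might search for such a field is to try each of the sixteen rotate values in turn. The sketch below is an assumption about how the search could be done, not a description of the actual assembler; the function name is hypothetical. It relies only on the rule stated earlier that the eight-bit value is zero extended and rotated right by twice the rotate field.

```python
MASK32 = 0xFFFFFFFF


def encode_immediate(value):
    """Return (rotate_field, imm8) whose rotation equals value, or None if impossible."""
    value &= MASK32
    for rotate_field in range(16):
        amount = 2 * rotate_field
        # Rotating the target left by `amount` undoes the hardware's rotate right.
        candidate = ((value << amount) | (value >> (32 - amount))) & MASK32
        if candidate < 256:
            return rotate_field, candidate
    return None   # the assembler would report an error here
```

For example, `encode_immediate(0xFC000003)` returns `(3, 0xFF)`, the constant aligned with the PSR flag bits, while `encode_immediate(0x101)` returns `None` because its set bits do not fit in any eight-bit window.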


Examples:
ADDEQ  R2, R4, R5         ; Equivalent to: if (ZFLAG) R2 = R4+R5.
TEQS   R4,3               ; Test R4 for equality with 3 (The S is redundant, as the assembler
                          ; assumes it). Equivalent to: ZFLAG = (R4 == 3).
SUB    R4, R5, R7 LSR R2  ; Logical Right Shift R7 by the number in the bottom byte of R2, subtract
                          ; the result from R5, and put the answer into R4.
                          ; Equivalent to: R4 = R5 - (R7>>R2).

TEQP   R15, 0             ; (Assume non-user mode here). Change to
                          ; user mode and clear the N,Z,C,V,I, and F
                          ; flags. Note that R15 is in the Rn position, so
                          ; it comes without the PSR flags.
                          ; Equivalent to: R15 = FLAGS = 0.

MOVNV  R0, R0             ; Is a no-op, avoiding mode-change hazard.
                          ; Equivalent to: R0 = R0.

MOV    PC, LK             ; Equivalent to: PC = LK, or PC = R14.
                          ; Return from subroutine (R14 is an active one).

MOVS   PC, R14            ; Equivalent to: PC, PSR = R14.
                          ; Return from subroutine, restoring the status.


FIGURE 15. MULTIPLY, AND MULTIPLY-ACCUMULATE (MUL, MLA)


 31   28 27     22 21 20 19  16 15  12 11   8 7    4 3    0
| Cond  | 000000 | A | S |  Rd  |  Rn  |  Rs  | 1001 |  Rm  |

Cond = Conditional Execution Control Field
A    = Accumulate bit (MLA specifier)
       0 = Multiply (MUL)
       1 = Multiply and Accumulate (MLA)
S    = Set Condition Codes
       0 = Do not alter Condition Codes
       1 = Set Condition Codes
Rd, Rn, Rs, Rm = Operand registers
       MUL: Rd = Rm * Rs        (Rn is ignored)
       MLA: Rd = Rm * Rs + Rn


The Multiply and Multiply-Accumulate
instructions use a two-bit Booth's
algorithm to perform integer multiplication.
They give the least significant 32
bits of the product of two 32-bit operands,
and may be used to synthesize
higher precision multiplications.


The Multiply form of the instruction
gives Rd = Rm*Rs. Rn is ignored, and
should be set to zero for compatibility
with possible future upgrades to the
instruction set.


The Multiply-Accumulate form gives
Rd = Rm*Rs+Rn, which can save an
explicit ADD instruction in some
circumstances.


Both forms of the instruction work on
operands which may be considered as
signed (two's complement) or unsigned
integers.


Operand restrictions - Due to the way
the Booth's algorithm has been implemented,
certain combinations of
operand registers should be avoided.
(The assembler will issue a warning if
these restrictions are violated.) The
destination register (Rd) should not be
the same as the Rm operand register,
as Rd is used to hold intermediate
values and Rm is used repeatedly
during the multiply. A MUL will give a
zero result if Rm = Rd, and a MLA will
give a meaningless result.


The destination register Rd should also
not be R15, as it is protected from
modification by these instructions. The
instruction will have no effect, except
that meaningless values will be placed
in the PSR flags if the S bit is set. All
other register combinations will give
correct results, and Rd, Rn and Rs may
use the same register when required.


PSR Flags - Setting the PSR flags is
optional, and is controlled by the S bit in
the instruction. The N and Z flags are
set correctly on the result (N is equal to
bit 31 of the result, Z is set if and only if
the result is zero), the V flag is unaffected
by the instruction (as for logical
data processing instructions), and the C
flag is set to a meaningless value.


Writing to R15 - As mentioned above,
R15 must not be used as the destination
register (Rd). If it is so used, the
instruction will have no effect except
possibly to scramble the PSR flags.


R15 as an Operand - R15 may be used
as one or more of the operands, though
the result will rarely be useful. When
used as Rs the PC bits will be used
without the PSR flags, and the PC value
will be eight bytes on from the address
of the multiply instruction. When used
as Rn, the PC bits will be used along
with the PSR flags, and the PC will
again be eight bytes on from the
address of the instruction. When used
as Rm, the PC bits will be used together
with the PSR flags, but the PC will be
the address of the instruction plus 12
bytes in this case.

Syntax
MUL{cond}{S} Rd, Rm, Rs
MLA{cond}{S} Rd, Rm, Rs, Rn
where cond          Is a two-character condition code mnemonic.
S                   Set condition codes if present.
Rd, Rm, Rs and Rn   Are valid register mnemonics, such as R0-R15, SP, LK, or PC.
Notes:


Rd must not be R15 (PC), and must not be the same as Rm.
Items in {} are optional. Those in <> must be present.


Examples:
MUL     R1, R2, R3           ; R1 = R2 * R3. (R1,R2,R3 = Rd,Rm,Rs)
MLAEQS  R1, R2, R3, R4       ; Equivalent to: if (ZFLAG) R1 = R2*R3 + R4.

; The multiply instruction may be used to synthesize higher precision multiplications.
; For instance, multiply two 32-bit integers and generate a 64-bit result:

MOV     R0, R1 LSR 16        ; R0 (temporary) = top half of R1.
MOV     R4, R2 LSR 16        ; R4 = top half of R2.
BIC     R1, R1, R0 LSL 16    ; R1 = bottom half of R1.
BIC     R2, R2, R4 LSL 16    ; R2 = bottom half of R2.
MUL     R3, R1, R2           ; Low section of result.
MUL     R2, R0, R2           ; Middle section of result.
MUL     R1, R4, R1           ; Middle section of result.
MUL     R4, R0, R4           ; High section of result.
ADDS    R1, R2, R1           ; Add middle sections. (MLA not used, as we need R3 correct).
ADDCS   R4, R4, 0x10000      ; Carry from above add.
ADDS    R3, R3, R1 LSL 16    ; R3 is now bottom 32 product bits.
ADC     R4, R4, R1 LSR 16    ; R4 is now top 32 bits.


Notes: 


1. R1,R2 are registers containing the 32-bit integers. R3,R4 are registers for the 64-bit result.


2. RO is a temporary register. 


3. R1 and R2 are overwritten during the multiply. 
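
The decomposition used by the 64-bit multiply sequence can be checked against an ordinary wide multiplication. The sketch below mirrors the register steps with 32-bit wrap-around; it is a model for checking the arithmetic, not a cycle-accurate simulation. It assumes unsigned operands and that the low section is the product of the two bottom halves; the function name is illustrative.

```python
MASK32 = 0xFFFFFFFF


def mul64_from_32bit_multiplies(a, b):
    """Return (high32, low32) of a*b using only 32-bit multiplies and adds."""
    r1, r2 = a & MASK32, b & MASK32
    r0 = r1 >> 16                      # top half of first operand
    r4 = r2 >> 16                      # top half of second operand
    r1 &= 0xFFFF                       # bottom half of first operand
    r2 &= 0xFFFF                       # bottom half of second operand
    r3 = (r1 * r2) & MASK32            # low section
    r2 = (r0 * r2) & MASK32            # middle section
    r1 = (r4 * r1) & MASK32            # middle section
    r4 = (r0 * r4) & MASK32            # high section
    total = r2 + r1                    # add middle sections, keeping the carry
    r1 = total & MASK32
    if total >> 32:                    # carry is worth 2**48 in the product
        r4 = (r4 + 0x10000) & MASK32
    total = r3 + ((r1 << 16) & MASK32)
    r3 = total & MASK32                # bottom 32 product bits
    r4 = (r4 + (r1 >> 16) + (total >> 32)) & MASK32   # top 32 bits
    return r4, r3
```

With both operands equal to FFFFFFFFH the function returns `(0xFFFFFFFE, 0x00000001)`, matching the full product FFFFFFFE00000001H.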


Load/Store Value from Memory 
(LDR,STR) 

The register load/store instructions are 
used to load or store single bytes or 
words of data. The LDR and STR 
instructions differ from MOV instructions 
in that they move data between registers 
and a specified memory address. In 
contrast, the MOV instructions move data 
between registers, or move a constant 
(contained in the instruction) into a 
register. 


The memory address used in LDR/STR 
transfers is calculated by adding an offset 
to or subtracting an offset from a base 
register. Typically, a load of a labeled 
memory location involves the loading via 
a (signed) offset from the current PC. 
Regardless of the base register used, the
result of the offset calculation may be
written back into the base register if
'auto-indexing' is required.


Offsets and Auto-indexing - The offset 
from the base may be either a 12-bit 
binary immediate value in the instruction, 
or a second register (possibly shifted in 
some manner). The offset may be added 
to (U=1) or subtracted from (U=0) the 
base register Rn. The offset modification 
may be performed either before (pre- 
indexed, P=1) or after (post-indexed, 
P=0) the base is used as the transfer 
address. 


The W bit gives optional auto increment 


and decrement addressing modes. 
The modified base value may be 
written back into the base (W=1), or 
the old base value may be kept 
(W=0). In the case of post-indexed 
addressing, the write back bit is 
redundant, since the old base value 
can be retained by setting the offset to 
zero. Therefore, post-indexed data 
transfers always write back the 
modified base. 
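
The interaction of the P, U and W bits can be sketched as a small function that returns the transfer address and the new base value. The names are illustrative, and the offset is taken as an already-computed unsigned value (immediate or shifted register); the -TRAN behaviour of the W bit in post-indexed transfers is not modelled.

```python
MASK32 = 0xFFFFFFFF


def ldr_str_address(base, offset, pre_indexed, up, write_back):
    """Return (transfer_address, new_base) for one LDR/STR transfer.

    pre_indexed -> P bit, up -> U bit, write_back -> W bit.
    """
    modified = (base + offset if up else base - offset) & MASK32
    if pre_indexed:
        # The offset is applied before the transfer; write back is optional.
        return modified, (modified if write_back else base)
    # Post-indexed: the old base is the address and the modified base is always kept.
    return base, modified
```

For example, `ldr_str_address(0x1000, 4, pre_indexed=False, up=True, write_back=False)` transfers at 1000H and leaves 1004H in the base register.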


Hardware Address Translation -
The only use of the W bit in a post-indexed
data transfer is in non-user
mode code, where setting the W bit
forces the -TRAN pin low for the
transfer, allowing the operating
system to generate a user address in
a system where the memory management
hardware makes suitable use of
this pin.


Shifted Register Offset - The eight 
shift control bits are described in the 
data processing instructions, but the 
register specified shift amounts are 
not implemented in this instruction 
class. 


Bytes and Words - This instruction 
class may be used to transfer a byte 
(B=1) or a word (B=0) between a 
processor register and memory. 


A byte load (LDRB) expects the data 
on bits D31 to D24 if the supplied 


address is on a word boundary, on bits 
D23 to D16 if it is a word address plus 
one byte, and so on. The selected byte 
is placed in the bottom eight bits of the 
destination register, and the remaining 
bits of the register are filled with zeroes. 


A byte store (STRB) repeats the bottom 
eight bits of the source register four 
times across the data bus. The external 
memory system should activate the 
appropriate byte subsystem to store the 
data. 


A word load (LDR) will normally 
generate a word aligned address but 
may also generate a non-word-aligned 
address. An address offset from a word 
boundary will cause the data to be 
rotated into the register so that the 
addressed byte position in the data 
occupies bits D31 to D24. Reference
Appendix 1, Table 1.


Use of R15 - These instructions will 
never cause the PSR to be modified, 
even when Rd or Rn is R15. 


If R15 is specified as the base register
(Rn), the PC is used without the PSR 
flags. When using the PC as the base 
register one must remember that it 
contains an address eight bytes 
advanced from the address of the 
current instruction. 


If R15 is specified as the register offset 
(Rm), the value presented will be the 
PC together with the PSR. 

FIGURE 16. LOAD/STORE VALUE FROM MEMORY (LDR,STR) 

 31    28 27 26  25  24  23  22  21  20  19   16 15   12 11          0
| Cond   | 0  1 | I | P | U | B | W | L |  Rn   |  Rd   | Operand 2    |

Cond       Condition Code 
I          Immediate Value 
           0 = Operand 2 is a register. 
           1 = Operand 2 is an immediate value. 
P          Pre/Post Indexing 
           0 = post: [base],index 
           1 = pre: [base,index] 
U          Up/Down bit 
           0 = offset is negative 
           1 = offset is positive 
B          Byte/Word bit 
           0 = word transfer 
           1 = byte transfer (B) 
W          Write-back bit 
           0 = no write-back 
           1 = Write address back into base (!). 
L          Load/Store: 0 = STR, 1 = LDR 
Rn         Base Register 
Rd         Source/Destination Register 

Imm=1 --> Operand 2 is an immediate value. 

 11                      0
| Unsigned 12-bit value   |

Imm=0 --> Operand 2 is in a register. 

 11           7 6    5  4  3      0
| Shift Amount | Type | 0 |  Rm    |

Shift Amount   A 5-bit shift count, to be applied to the Rm register. 
Rm             2nd-Operand register 
Shift Type     00 = Logical Left (LSL) 
               01 = Logical Right (LSR) 
               10 = Arithmetic Right (ASR) 
               11 = Rotate Right (ROR) 

Note: There is no Rs form of shift for the LDR/STR class. That is, the shift amount cannot be contained in a register. 
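
As a reading aid for Figure 16, the sketch below splits a 32-bit LDR/STR instruction word into its fields. It is a minimal illustration, not part of any assembler: the function names are hypothetical, and the single-bit positions (I=25, P=24, U=23, B=22, W=21, L=20) are assumed from the standard ARM single-data-transfer layout, which the figure appears to follow. The I bit is returned raw and is not interpreted.

```python
SHIFT_NAMES = ("LSL", "LSR", "ASR", "ROR")

def decode_ldr_str(word):
    """Split an LDR/STR instruction word into the Figure 16 fields."""
    return {
        "cond": (word >> 28) & 0xF,
        "I": (word >> 25) & 1,
        "P": (word >> 24) & 1,
        "U": (word >> 23) & 1,
        "B": (word >> 22) & 1,
        "W": (word >> 21) & 1,
        "L": (word >> 20) & 1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "operand2": word & 0xFFF,
    }

def decode_register_operand2(operand2):
    """Split a register-form Operand 2 into shift amount, shift type and Rm."""
    return {
        "shift_amount": (operand2 >> 7) & 0x1F,   # 5-bit constant, never a register
        "shift_type": SHIFT_NAMES[(operand2 >> 5) & 3],
        "rm": operand2 & 0xF,
    }
```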


When R15 is the source register (Rd) of 
a register store (STR) instruction, the 
value stored will be the PC together 
with the PSR. The stored value of the 
PC will be 12 bytes on from the address 
of the instruction. A load register (LDR) 
with R15 as Rd will change only the PC, 
and the PSR will be unchanged. 


Data Aborts - A transfer to or from a 
legal address may still present special 
cases for a memory management 
system. For instance, in a system 
which uses virtual memory, the required 
data may be absent from main memory. 
The memory manager can signal a 
problem by taking the processor ABRT 
pin high, whereupon the data transfer 
instruction will be prevented from 
changing the processor state, and the 
data abort trap will be taken. It is up to 
the system software to resolve the 
cause of the problem. The instruction 
can then be restarted and the original 
program continued. 

Syntax: 
    LDR/STR{cond}{B}{T} Rd,<Address>{!} 
where LDR      means load from memory into a register. 
      STR      means store from a register into memory. 
      cond     is a two-character condition mnemonic (see Condition Code section). 
      B        If present, implies a byte transfer; otherwise a word transfer. 
      T        If present, the W bit is set in a post-indexed instruction, causing the 
               -TRAN pin to go low for the transfer cycle. T is not allowed when a 
               pre-indexed addressing mode is specified or implied. 
      Rd       is a valid register: R0-R15, SP, LK, or PC. 
      Address  Can be any of the variations in the following table. 


Address Variants: 
Address expression:    An expression evaluating to a relocatable address: 
  <expression>         The assembler will attempt to generate an instruction using the PC 
                       as a base, and a corrected offset to the location given by the 
                       expression. This is a PC-relative pre-indexed address. If out of range 
                       (at assembly or link time), an error message will be given. 

Pre-indexed address:   Offset is added to base register before using as effective address, and 
                       offsets are placed within the [ ] pair. Rn may be viewed as a pointer: 
  [Rn]{!}                     No offset is added to base address pointer. 
  [Rn, <expression>]{!}       Signed offset of expression bytes is added to base pointer. 
  [Rn, Rm]{!}                 Add Rm to Rn before using Rn as an address pointer. 
  [Rn, Rm <shift> count]{!}   Signed offset of Rm (modified by shift) is added to base pointer. 

Post-indexed address:  Offset is added to base reg, after using base reg for the effective address. 
                       Offsets are placed after the [ ] pair: 
  [Rn],<expression>{!}        Expression is added to Rn, after Rn’s usage as a pointer. 
  [Rn], Rm{!}                 Rm is added to Rn, after Rn’s usage as an address pointer. 
  [Rn], Rm <shift> count{!}   Shift the offset in Rm by count bits, and add to Rn, after 
                              Rn’s usage as an address pointer. 

where expression   A signed 13-bit expression (including the sign). 
      Rm, Rn       Valid register names: R0-R15, SP, LK, or PC. If Rn = PC, the assembler 
                   will subtract 8 from the expression to allow for processor address readahead. 
      shift        Any of: LSL, LSR, ASR, ROR, or RRX. 
      count        Amount to shift Rm by. It is a 5-bit constant, and may not be 
                   specified as an Rs register (as for some other instruction classes). 
      !            If present, the ! sets the W-bit in the instruction, forcing the 
                   effective offset to be added to the Rn register, after completion. 


Examples (Pre-index): 
In each of these examples, the effective offset is added to the Rn (base pointer) register prior to using the Rn register as the 
effective address. Rn is then updated only if the ! suffix is supplied. 

    STR     R1, [R2, R1]!         ; *(R2+R1) = R1. Then R2 += R1. 
    STR     R3, [R2]              ; *(R2) = R3. 
    LDR     R1, [R0, 16]          ; R1 = *(R0 + 16). Don’t update R0. 
    LDR     R9, [R5, R0 LSL 2]    ; R9 = *(R5 + (R0<<2)). Don’t update R5. 
    LDREQB  R2, [R5, 5]           ; if (Zflag) R2 = *(R5 + 5), a zero-filled byte load. 


Examples (Post-index): 
In each of these examples, the effective offset is added to the Rn (base pointer) register after using the Rn register as the 
effective address. Rn is then updated unconditionally, regardless of any ! suffix. 

    STR     R1, [R2], R1!         ; *R2 = R1. Then R2 += R1. 
    STR     R3, [R2], R5!         ; *(R2) = R3. Then R2 += R5. 
    LDR     R1, [R0], 16          ; R1 = *R0. Then R0 += 16. 
    LDR     R9, [R5], R0 ASR 3    ; R9 = *R5. Then R5 += (R0 / 8). 
    LDREQB  R2, [R5], 5           ; if (Zflag) R2 = *R5, a zero-filled byte load, and then R5 += 5. 
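
The difference between the two indexing forms can be summarized in a short model. This Python sketch is a hedged illustration with a hypothetical function name: it returns the address placed on the bus and the value left in Rn, following the rules stated above (pre-indexed updates Rn only with the ! suffix; post-indexed always updates Rn).

```python
MASK32 = 0xFFFFFFFF

def transfer(rn, offset, pre_indexed, up=True, writeback=False):
    """Return (address used for the transfer, value of Rn afterwards)."""
    delta = offset if up else -offset
    modified = (rn + delta) & MASK32
    if pre_indexed:
        # The modified base is the address; Rn changes only if ! was given.
        return modified, (modified if writeback else rn)
    # Post-indexed: the old base is the address; Rn is always updated.
    return rn, modified
```

For example, `transfer(0x2000, 16, pre_indexed=True)` models `LDR R1, [R0, 16]` with R0 = 0x2000, and `transfer(0x2000, 16, pre_indexed=False)` models `LDR R1, [R0], 16`.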
Examples (Expression): 


In these examples, the PLACE label is an internal or external PC-relative label, typically created as shown. PC-relative refer- 
ences are precompensated for the 8-byte read-ahead done by the processor. PARMx is a register-relative label, typically created 
via a DTYPE directive, and assumed to be relative to the LK (R14) register. DATAx is similar, but is presumably defined relative 
to the SP (R13) register, and GENERAL relative to R0. In any case, they may be located up to ±4096 bytes from the associated 
base register. 

        LDR     R0, DATA1       ; SP-relative. Same as: LDR R0, [SP+DATA1]. 
        STR     R2, PLACE       ; PC-relative. Same as: STR R2, [PC+16]. 
        LDR     R1, PARM0       ; LK-relative. Same as: LDR R1, [LK+PARM0]. 
        STR     R1, GENERAL     ; R0-relative. Same as: STR R1, [R0+GENERAL]. 
        B       Across          ; Skip over the data temporary. 
PLACE   DW      0               ; Temporary storage area. 
Across  ...                     ; Resume execution. 
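
The 8-byte read-ahead compensation mentioned above amounts to a small calculation. The sketch below is an assumption-laden illustration (hypothetical function name, 12-bit unsigned offset with a separate Up/Down bit as in Figure 16) of how an assembler might derive the encoded fields for a PC-relative reference.

```python
def pc_relative_fields(instr_address, target_address):
    """Return (U bit, 12-bit offset) for a PC-based pre-indexed transfer."""
    pc_value = instr_address + 8            # the PC reads eight bytes ahead
    distance = target_address - pc_value
    if abs(distance) > 0xFFF:
        raise ValueError("target out of range for a 12-bit offset")
    return (1 if distance >= 0 else 0), abs(distance)
```

In the example above, PLACE sits 16 bytes after the STR instruction, so the encoded offset is 8 with the Up/Down bit set.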
FIGURE 17. LOAD/STORE REGISTER LIST FROM MEMORY (LDM,STM) 

 31    28 27   25  24  23  22  21  20  19   16 15               0
| Cond   | 1 0 0 | P | U | S | W | L |  Rn   | Register List     |

Cond       Condition Code 
P          Pre/Post Indexing Form 
           0 = Post: [base],index 
           1 = Pre: [base,index] 
U          Up/Down Bit 
           0 = offset is negative 
           1 = offset is positive 
S          PSR & Force-User bit (^ suffix) 
           0 = Do not load PSR or force user mode. 
           1 = Load PSR, and optionally force user mode (^). 
W          Write-back bit 
           0 = no write-back 
           1 = Write address back into base (!). 
L          Load/Store: 0 = STM, 1 = LDM 
Rn         Base Register 

The multi-register transfer instructions 
are used to load (LDM) or store (STM) 
any subset of the currently visible 
registers. They support all possible 
stacking modes (push up/pop down, or 
push down/pop up). They are very 
efficient instructions for saving or 
restoring context, or for moving large 
blocks of data around main memory. 


The Register List - The instruction can 
cause the transfer of any registers in 
the current bank (and non-user mode 
programs can also transfer to and from 
the user bank). The register list is 
contained in a 16-bit field in the 
instruction, with each bit corresponding 
to a register. A logic one in bit zero of 
the register field will cause R0 to be 
transferred, a logic zero will cause it not 
to be transferred; similarly bit one 
controls the transfer of R1, and so on. 
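
The register list field is a plain bit mask, as the following Python sketch shows. The helper names are hypothetical and exist only for illustration.

```python
def encode_register_list(registers):
    """Build the 16-bit register list field: bit n set means Rn is transferred."""
    mask = 0
    for r in registers:
        if not 0 <= r <= 15:
            raise ValueError("register number must be 0..15")
        mask |= 1 << r
    return mask

def decode_register_list(mask):
    """Return the register numbers in transfer order (lowest first)."""
    return [r for r in range(16) if mask & (1 << r)]
```

Because the field is a mask, the order in which registers are written in the source has no effect on the order of transfer.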


Addressing Modes - The transfer 
addresses are determined by the 
contents of the base register (Rn), the 
pre/post bit (P) and the up/down bit (U). 
The registers are transferred in the 
order lowest to highest, so R15 (if in 
the list) will always be transferred last. 
The lowest register also gets trans- 
ferred to/from the lowest memory 
address. This is illustrated in Figures 
18 and 19. 


Transfer of R15 - Whenever R15 is 
stored to memory, the value transferred 
is the PC together with the PSR flags. 
The stored value of the PC will be 12 
bytes advanced from the address of the 
STM instruction. 


If R15 is in the transfer list of a load 
multiple (LDM) instruction the PC is 
overwritten, and the effect on the PSR 
is controlled by the S bit. If the S bit is 
zero the PSR is preserved unchanged, 
but if the S bit is set the PSR will be 
overwritten by the corresponding bits of 
the loaded value. In user mode, 
however, the I, F, M1, and M0 bits are 
protected from change, whatever the 
value of the S bit. The mode at the start 
of the instruction determines whether 
these bits are protected, and the 
supervisor may return to the user 
program, reenabling interrupts and 
restoring user mode with one LDM 
instruction. 
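
The effect of the S bit on a load of R15 can be expressed as a bit-merge. The sketch below is illustrative only and rests on a labeled assumption: R15 is taken to hold N, Z, C, V, I, F in bits 31:26, the PC in bits 25:2, and M1, M0 in bits 1:0, which is the usual arrangement for a 26-bit program address space. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
PC_MASK = 0x03FFFFFC         # assumed PC field, bits 25:2
PROTECTED_MASK = 0x0C000003  # assumed I, F (bits 27:26) and M1, M0 (bits 1:0)

def ldm_write_r15(old_r15, loaded, s_bit, user_mode):
    """Return the new R15 after an LDM whose register list includes R15."""
    if not s_bit:
        # S = 0: only the PC is overwritten; the PSR is preserved.
        return (old_r15 & ~PC_MASK & MASK32) | (loaded & PC_MASK)
    if user_mode:
        # S = 1 in user mode: I, F, M1 and M0 are protected from change.
        return (old_r15 & PROTECTED_MASK) | (loaded & ~PROTECTED_MASK & MASK32)
    # S = 1 in a non-user mode: PC and PSR are both taken from the loaded word.
    return loaded & MASK32
```

`user_mode` refers to the mode at the start of the instruction, as the text specifies.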


Transfers to User Bank - For STM 
instructions the S bit is redundant as the 
PSR is always stored with the PC 
whenever R15 is in the transfer list. In 
user mode the S bit is ignored, but in 
other modes it has a second interpreta- 
tion. S = 1 is used to force transfers to 
take values from the user register bank 
instead of from the current register 
bank. This is useful for saving the user 
state on process switches. Note that 
when it is so used, write back of the 
base will also be to the user bank, 
though the base will be fetched from the 
current bank. Therefore don’t use write 
back when forcing user bank. 


In LDM instructions the S bit is redun- 
dant if R15 is not in the transfer list, and 
again in user mode it is ignored. In 
non-user mode where R15 is not in the 
transfer list, S=1 is used to force loaded 
values into user registers instead of the 
current register bank. When used in 
this manner, care must be taken not to 
read from a banked register during the 
following cycle; if in doubt, insert a no- 
op. Again, don’t use write back when 
forcing a user bank transfer. 


R15 as the Base - When the base is the 
PC, the PSR bits will be used to form 
the address as well. Also, write back is 
never allowed when the base is the PC 
(setting the W bit will have no effect). 


Base Within the Register List - When 
write back is specified, the base is 
written back at the end of the second 
cycle of the instruction. During a STM, 
the first register is written out at the start 
of the second cycle. A STM which 
includes storing the base, with the base 
as the first register to be stored, will 
therefore store the unchanged value, 
whereas one with the base second or 
later in the transfer order will store the 
modified value. An LDM will always 
overwrite the updated base if the base 
is in the list. 


Abort During an STM - If the abort 
occurs during a store multiple instruc- 
tion, the processor takes little action 
until the instruction completes, where- 
upon it enters the data abort trap. The 
memory manager is responsible for 
preventing erroneous writes to the 
memory. The only change to the 
internal state of the processor will be 
the modification of the base register if 
write back was specified, and this must 
be reversed by software (and the cause 
of the abort resolved) before the 
instruction may be retried. 


To illustrate the various load/store 
modes, consider the transfer of R1, R5 
and R7 in the case where Rn = 1000H 
and write back of the modified base is 
required (W = 1). These figures show 
the sequence of register transfers, the 
addresses used, and the value of Rn 
after the instruction has completed. 


In all cases, had write back of the 
modified base not been required (W=0), 
Rn would have retained its initial value 
of 1000H unless it was also in the 
transfer list of the load multiple register 
instruction. Then it would have been 
overwritten with the loaded value. 
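
The case described above (R1, R5 and R7 with Rn = 1000H) can be worked through with a short model. This Python sketch is a hedged illustration with a hypothetical function name. It assumes word transfers of 4 bytes each, with the lowest register always going to the lowest address, as stated under Addressing Modes.

```python
def block_transfer(base, registers, pre, up, writeback=True):
    """Return ([(register, address), ...], value of Rn after the instruction)."""
    regs = sorted(registers)               # lowest register is transferred first
    count = len(regs)
    if up:
        first = base + 4 if pre else base
        final = base + 4 * count
    else:
        first = base - 4 * count if pre else base - 4 * count + 4
        final = base - 4 * count
    plan = [(r, first + 4 * i) for i, r in enumerate(regs)]
    return plan, (final if writeback else base)
```

Under these assumptions, the four modes give final Rn values of 0x100C (incrementing) and 0x0FF4 (decrementing), and the pre- and post- forms differ only in whether the word at 0x1000 itself is used.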


Aborts During LDM - When the 
processor detects a data abort during a 
load multiple instruction, it modifies the 
operation of the instruction to ensure 
that recovery is possible. 


Overwriting of registers stops when the 
abort happens. The aborting load will 
not take place, nor will the preceding 
one, but registers two or more positions 
ahead of the abort (if any) will be 
loaded. (This guarantees that the PC 
will be preserved, since it is always the 
last register to be overwritten.) 


The base register is restored, to its 
(modified) value if write back was 
requested. This ensures recoverability 
in the case where the base register is 
also in the transfer list, and may have 
been overwritten before the abort 
occurred. 


The data abort trap is taken when the 
load multiple has completed, and the 
system software must undo any base 
modification (and resolve the cause of 
the abort) before restarting the instruc- 
tion. 

The following figures illustrate the impact of various addressing modes. 
R1, R5, and R7 are moved to/from memory, where Rn=0x1000, and a 
write-back of the modified base is done (W=1). Figure 18 shows the sequence 
of incrementing "pushes", the addresses used, and the final value of Rn. 
Figure 19 illustrates decrementing "pushes" to the stack based upon Rn. 
Without writeback, Rn would remain at 0x1000. 

FIGURE 18. INCREMENTING INDEX 
[Diagrams: Post-Increment Addressing and Pre-Increment Addressing. Each shows memory from 0x0FF4 to 0x100C in four panels: (1) Before STM Instruction, (2) After First Transfer, (3) After Second Transfer, (4) After STM Instruction Complete.] 

FIGURE 19. DECREMENTING INDEX 
[Diagrams: Post-Decrement Addressing and Pre-Decrement Addressing. Each shows memory from 0x0FF4 to 0x100C in the same four panels.] 

Syntax: 

    LDM|STM{cond}<mode> Rn{!}, <Rlist>{^} 

where cond   Is an optional 2-letter condition code common to all instructions. 
      mode   Is any of: FD, ED, FA, EA, IA, IB, DA, or DB. 
      Rn     Is a valid register name: R0-R15, SP, LK, or PC. 
      Rlist  Can be a single register (as described above for Rn), or may be a list of 
             registers, enclosed in { } (e.g. {R0,R2,R7-R10,LK}). 
      !      If present, requests write back (W=1). Otherwise W=0. 
      ^      If present, set S bit to load the PSR with the PC, or force transfer of user 
             bank, when in non-user mode. 


Addressing Mode Names - There are different assembler mnemonics for each of the addressing modes, depending on whether 
the instruction is being used to support stacks, or for other purposes. The names may be used interchangeably: e.g., LDMED 
performs exactly the same as LDMIB. The name equivalences and instruction bit values are: 

                        Use as   Other 
Name                    Stack    usages   L Bit  P Bit  U Bit  Operation 
Pre-increment load      LDMED    LDMIB      1      1      1    Pop upwards 
Post-increment load     LDMFD    LDMIA      1      0      1    Pop upwards 
Pre-decrement load      LDMEA    LDMDB      1      1      0    Pop downwards 
Post-decrement load     LDMFA    LDMDA      1      0      0    Pop downwards 
Pre-increment store     STMFA    STMIB      0      1      1    Push upwards 
Post-increment store    STMEA    STMIA      0      0      1    Push upwards 
Pre-decrement store     STMFD    STMDB      0      1      0    Push downwards 
Post-decrement store    STMED    STMDA      0      0      0    Push downwards 

FD, ED, FA, EA indicate whether or not the addressed memory cell has valid data in it (from the previous push or pop), and which 
direction the stack is to flow. They define the settings of the L, P, and U bits, based on the form of stack required. 

The F and E refer to a “full” or “empty” stack cell. The A and D refer to whether the stack is ascending or descending. If ascend- 
ing, a STM will go up and LDM down; if descending, vice-versa. 

IA, IB, DA, DB allow control when LDM/STM are not being used for stacks and simply mean Increment After, Increment Before, 
Decrement After, Decrement Before. 
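
The name equivalences can be captured in a small lookup. The sketch below is illustrative: the function name is hypothetical, and it handles only the bare mnemonic without a condition code.

```python
# (P, U) bits for the non-stack suffixes.
INDEX_BITS = {"IB": (1, 1), "IA": (0, 1), "DB": (1, 0), "DA": (0, 0)}

# Stack suffixes map to different index modes for loads and stores.
STACK_ALIASES = {
    "LDM": {"ED": "IB", "FD": "IA", "EA": "DB", "FA": "DA"},
    "STM": {"FA": "IB", "EA": "IA", "FD": "DB", "ED": "DA"},
}

def lpu_bits(mnemonic):
    """Return the (L, P, U) bits for a mnemonic such as 'LDMFD' or 'STMIA'."""
    op, mode = mnemonic[:3].upper(), mnemonic[3:].upper()
    if op not in STACK_ALIASES:
        raise ValueError("expected an LDM or STM mnemonic")
    mode = STACK_ALIASES[op].get(mode, mode)
    p, u = INDEX_BITS[mode]
    return (1 if op == "LDM" else 0), p, u
```

A push with STMFD and the matching pop with LDMFD therefore use opposite P and U settings, which is what keeps a full descending stack consistent.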


Examples 
    LDMFD   SP!, {R0, R1, R2}     ; unstack 3 registers 
    STMIA   BASE, {R0-R15}        ; save all registers 

These instructions may be used to save state on subroutine entry, and restore it efficiently on return to the calling routine: 

    STMED   SP!, {R0-R3, LK}      ; Save R0 to R3 for workspace, and R14 for returning. 
    BL      Subroutine            ; This call will overwrite R14 
    LDMED   SP!, {R0-R3, PC}      ; Restore workspace and return, restoring PSR flags. 

FIGURE 20. SOFTWARE INTERRUPT (SWI) 

 31    28 27     24 23                                              0
| Cond   | 1 1 1 1 | Instruction to executive (ignored by APRM)     |

Note: The machine comments field in bits 23:0 is ignored by the hardware. It is made available for free interpretation by 
the software executive, and may be found in LSB-first byte order on the stack. 

The Software Interrupt (SWI) instruction 
is used to enter supervisor mode in a 
controlled manner. The instruction 
causes the software interrupt trap to be 
taken, which effects the mode change, 
with execution resuming at 0x08. If this 
address is suitably protected (by 
external memory management hard- 
ware) from modification by the user, a 
fully protected operating system may be 
constructed. 

Return from the Supervisor - The PC 
and PSR are saved in R14_svc upon 
entering the software interrupt trap, with 
the PC adjusted to point to the word 
after the SWI instruction. MOVS R15, 
R14_svc will return to the user program, 
restore the user PSR and return the 
processor to user mode. 

Note that the link mechanism is not re- 
entrant, so if the supervisor code 
wishes to use software interrupts within 
itself it must first save a copy of the 
return address. 

Machine Comments Field - The 
bottom 24 bits of the instruction are 
ignored by the processor, and may be 
used to communicate with the supervi- 
sor code. For instance, the supervisor 
may extract this field and use it to index 
into an array of entry points for routines 
which perform various supervisor 
functions. 
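
The use of the machine comments field as a table index can be sketched as follows. This is a minimal Python model, not supervisor code: the function names are hypothetical, and the handler names simply echo the service routines used in the assembler example in this section.

```python
COMMENT_MASK = 0x00FFFFFF   # bottom 24 bits of the SWI instruction

def swi_comment(instruction):
    """Extract the 24-bit comment field that the processor ignores."""
    return instruction & COMMENT_MASK

def dispatch(instruction, table):
    """Pick a service routine by indexing the table with the comment field."""
    index = swi_comment(instruction)
    if index >= len(table):
        raise ValueError("no service routine for this comment value")
    return table[index]
```

A real supervisor would typically also decide what to do with comment values that fall outside the table, which this sketch reports as an error.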
Syntax: 
SWI{cond} <expression> 
where cond is the two-character condition code common to all instructions. 
expression Is a 24-bit field of any format. The processor itself ignores it, but the 
typical scenario is for the software executive to specify patterns in it, 
which will be interpreted in a particular way by the executive, as commands. 
Examples: 
        acons   Zero=0, ReadC=1, Write1=2   ; Assembler constants. 
        SWI     ReadC                       ; Get next character from read stream 
        SWI     Write1+"k"                  ; Output a "k" to the Write stream 
        SWINE   0                           ; Conditionally call supervisor with 0 in comment field 


The above examples assume that suitable supervisor code exists. For instance: 
; Assume that the R13_svc (the supervisor's R13) points to a suitable stack. 

            acons   Zero=0, ReadC=1, Write1=2   ; Assembler constants. 
            acons   CC_Mask = 0xFC000003        ; Non-address area mask. 
08h         B       Super                       ; SWI entry point 
Super       STMFD   SP!, {r0, r1, r2}           ; Save working registers. 
            BIC     r1, r14, CC_Mask            ; Strip condx codes from SWI instruction address. 
            LDR     R0, [R1, -4]                ; Get copy of SWI instruction. 
            BIC     R0, R0, 0xFF000000          ; Get lower 24 bits of SWI, only. 
            MOV     R1, SWI_Table               ; Get absolute address of PC-relative table. 
            LDR     PC, [R1, R0 LSL 2]          ; Jump indirect on the table. 
SWI_Table   dw      Zero_Action                 ; Address of service routines. 
            dw      ReadC_Action 
            dw      Write1_Action 

Write1_Action  ...                              ; Typical service routine. 
            LDM     R13, {R0-R2, PC}^           ; Restore workspace, and return to inst after SWI. 

FIGURE 21. COPROCESSOR DATA OPERATIONS (CPD) 

 31    28 27     24 23     20 19   16 15   12 11    8 7    5  4  3     0
| Cond   | 1 1 1 0 | CP Opc  |  CRn  |  CRd  |  CP#  |  CP  | 0 |  CRm  |

Cond       Condition Code 
CP Opc     Coprocessor Operation Code 
CRn, CRm   Coprocessor Operand Registers 
CRd        Coprocessor Destination Register 
CP#        Coprocessor Number 
CP         Coprocessor Auxiliary Information 

The instruction is executed only if the 
condition code field is true. The field is 
described in the Condition Codes 
section. 

This is actually a class of instructions, 
rather than a single instruction, and is 
equivalent to the ALU class on the 
APRM. All instructions in this class are 
used to direct the coprocessor to 
perform some internal operation. No 
result is sent back to the APRM, and 
the APRM will not wait for the operation 
to complete. The coprocessor could 
maintain a queue of such instructions 
awaiting execution. Their execution 
may then overlap other APRM activity, 
allowing the two processors to perform 
independent tasks in parallel. 

Coprocessor Fields - Only bit 4 and 
bits 31:24 are significant to the APRM; 
the remaining bits are used by 
coprocessors. The above field names 
are used by convention, but particular 
coprocessors may redefine the use of 
any or all fields as appropriate, except 
for the CP#. For the sake of future 
family product introductions, it is 
encouraged that the above conventions 
be followed, unless absolutely neces- 
sary. 

By convention, the coprocessor should 
perform an operation specified in the 
CP Opc field (and possibly in the CP 
field) on the contents of CRn and CRm, 
placing the result into CRd. 
Syntax: 
    CPD{cond} CP#,<expression1>, CRd, CRn, CRm{,<expression2>} 
where cond            Is the conditional execution code, common to all instructions. 
      CP#             Is the (unique) coprocessor number, assigned by hardware. 
      CRd, CRn, CRm   These are valid coprocessor registers: CR0-CR15. 
      expression1     Evaluates to a constant, and is placed in the CP Opc field. 
      expression2     (Where present) evaluates to a constant, and is placed in the CP field. 

Examples: 
    CDP     1, 10, CR1, CR7, CR2      ; Request co-proc #1 to do operation 10 on CR7 and CR2, putting result into CR1. 
    CDPEQ   2, 5, CR1, CR2, CR3, 2    ; If the Z flag is set, request co-proc #2 to do 
                                      ; operation 5 (type 2) on CR2 and CR3, placing the result into CR1. 
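
To show how the syntax operands map onto the instruction word, the sketch below packs the fields of Figure 21 into a 32-bit value. It is an illustration under stated assumptions: the function name is hypothetical, bits 27:24 are taken to be 1110 and bit 4 to be 0 (the standard ARM coprocessor data operation layout), and the condition field is passed in as a raw 4-bit number because the condition code table lies outside this section.

```python
def encode_cpd(cond, cp_num, cp_opc, crd, crn, crm, cp=0):
    """Pack a coprocessor data operation into a 32-bit instruction word."""
    fields = [(cond, 4), (cp_num, 4), (cp_opc, 4), (crd, 4), (crn, 4), (crm, 4), (cp, 3)]
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit its width")
    return ((cond << 28) | (0b1110 << 24) | (cp_opc << 20) | (crn << 16)
            | (crd << 12) | (cp_num << 8) | (cp << 5) | crm)   # bit 4 stays 0
```

With a condition field of 0b1110 (assumed here to mean "always", as in standard ARM encodings), the first example above would pack to 0xEEA71102.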


FIGURE 22. COPROCESSOR LOAD/STORE DATA (LDC/STC) 

 31    28 27   25  24  23  22  21  20  19   16 15   12 11    8 7          0
| Cond   | 1 1 0 | P | U | N | W | L |  Rn   |  CRd  |  CP#  |  Offset     |

Cond       Condition Code 
P          Index Control 
U          Up/Down 
           0 = Subtract Offset 
           1 = Add Offset 
N          Transfer Length 
W          Writeback 
           0 = No writeback 
           1 = Write e.a. to Rn. 
L          Load/Store Bit 
           0 = Store to Memory 
           1 = Load to Coproc Reg 
Rn         Base Register 

The LDC and STC instructions are used 
to load or store single bytes or words of 
data. They differ from MCR and MRC 
instructions in that they move data 
between coprocessor registers and a 
specified memory address. In contrast, 
the other instructions move data 
between registers, or move a constant 
(contained in the instruction) into a 
register. 


The memory address used in LDC/STC 
transfers is calculated by adding an 
offset to or subtracting an offset from a 
base pointer register, Rn. Typically, a 
load of a labeled memory location 
involves the loading via a (signed) offset 
from the current PC. Regardless of the 
base register used, the result of the 
offset calculation may be written back 
into the base register if ‘auto-indexing’ 
is required. 


Coprocessor Fields - The CP# field 
identifies which coprocessor shall 
supply or receive the data. A coproces- 
sor will respond only if its number 
matches the contents of this field. 


The CRd field and N bit contain 
information which may be interpreted in 
different ways by different coproces- 
sors. By convention, however, CRd is 
the register to be transferred (or the first 
register, where more than one is to be 
transferred). The N bit is used to 
choose one of two transfer length 
options. For instance, N=0 could select 
the transfer of a single register, and 
N=1 could select the transfer of all 
registers for context switching. 


Offsets and Indexing - The APRM is 
responsible for providing the address 
used by the memory system for the 
transfer, and the modes available are 
similar to those used for the APRM’s 
LDR/STR instructions. 


Only 8-bit offsets are permitted, and the 
APRM automatically scales them by two 
bits to form a word offset to the pointer 
in the Rn register. Of itself, the offset is 
an 8-bit unsigned value, but a 9-bit 
signed negative offset may be supplied. 
The assembler will complement it to an 
8-bit (positive) value and will clear the 
instruction’s U bit, forcing a compensat- 
ing subtract. The result is a ±256 word 
(1024 byte) offset from Rn. Again, the 
APRM internally shifts the offset left two 
bits before addition to the Rn register. 
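
The offset handling described above can be sketched in two small steps: what the assembler stores, and what the processor computes. The Python below is illustrative only, with hypothetical function names, and assumes the byte offset written in the source must be a multiple of 4.

```python
def encode_coproc_offset(byte_offset):
    """Return (U bit, 8-bit word offset) for an LDC/STC byte offset."""
    if byte_offset % 4:
        raise ValueError("offset must be a multiple of 4 bytes")
    words = abs(byte_offset) // 4
    if words > 0xFF:
        raise ValueError("offset does not fit in 8 bits")
    # A negative offset is stored as a positive magnitude with U cleared.
    return (1 if byte_offset >= 0 else 0), words

def apply_coproc_offset(rn, u_bit, offset8):
    """Model the processor side: shift left two bits, then add or subtract."""
    delta = offset8 << 2
    return (rn + delta if u_bit else rn - delta) & 0xFFFFFFFF
```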


The offset modification may be per- 
formed either before (pre-indexed, P=1) 
or after (post-indexed, P=0) the base is 
used as the transfer address. The 
modified base value may be written 
back into the base (W=1), or the old 
base value may be kept (W=0). In the 
case of post-indexed addressing, the 
write back bit is redundant, since the old 
base value can be retained by setting 
the offset to zero. Therefore post- 
indexed data transfers always write 
back the modified base. 

For an offset of +1, the value of the Rn 
base pointer register (modified, in the 
pre-indexed case) is used for the first 
word transferred. Should the instruction 
be repeated, the second word will go 
from/to an address one word (4 bytes) 
higher than that pointed to by the 
original Rn, and so on. 

Use of R15 - If R15 is specified as the 
base register (Rn), the PC is used 
without the PSR flags. When using the 
PC as the base register note that it 
contains an address eight bytes 
advanced from the address of the 
current instruction. As with the LDR/ 
STR case, the assembler performs this 
compensation automatically. 

Hardware Address Translation - The 
W bit may be used in non-user mode 
programs (when post-indexed address- 
ing is used) to force the -TRAN pin low 
for the transfer cycle. This allows the 
operating system to generate user 
addresses when a suitable memory 
management system is present. 

Data Aborts - If the address is legal but 
the memory manager generates an 
abort, the data abort trap will be taken. 
The writeback of the modified base will 
take place, but all other processor state 
data will be preserved. The coproces- 
sor is partly responsible for ensuring 
restartability. It must either detect the 
abort, or ensure that any actions 
consequent from this instruction can be 
repeated when the instruction is retried 
after the resolution of the abort. 

Syntax: 
    <LDC/STC>{cond}{L}{T} CP#, CRd, <Address>{!} 
where LDC      means load from memory into a coprocessor register. 
      STC      means store a coprocessor register to memory. 
      cond     is a two-character condition mnemonic (see Condition Code section). 
      L        If present, implies a long transfer (N=1); otherwise a short transfer (N=0). 
      T        If present, the W bit is set in a post-indexed instruction, causing the 
               -TRAN pin to go low for the transfer cycle. T is not allowed when a 
               pre-indexed addressing mode is specified or implied. 
      CP#      Valid coprocessor number, determined by hardware. 
      CRd      Valid coprocessor register number: CR0-CR15. 
      Address  Can be any of the variations in the following table. 

Address Variants: 
Address expression:    An expression evaluating to a relocatable address: 
  <expression>         The assembler will attempt to generate an instruction using the PC 
                       as a base, and a corrected offset to the location given by the 
                       expression. This is a PC-relative pre-indexed address. If out of range 
                       (at assembly or link time), an error message will be given. 

Pre-indexed address:   Offset is added to base register before using as effective address, and 
                       offsets are placed within the [ ] pair. Rn may be viewed as a pointer: 
  [Rn]{!}                No offset is added to base address pointer. 
  [Rn, <expression>]     Signed offset of expression bytes is added to base pointer. 
  [Rn, <expression>]!    Signed offset of expression bytes is added to base pointer. Then 
                         this effective address is written back to Rn. 

Post-indexed address:  Offset is added to base reg, after using base reg for the effective 
                       address. Offsets are placed after the [ ] pair: 
  [Rn],<expression>      Expression is added to Rn, after Rn’s usage as a pointer. 

where expression   A signed 13-bit expression (including the sign). 
      Rn           A valid register name: R0-R15, SP, LK, or PC. If Rn = PC, the 
                   assembler will subtract 8 from the expression to allow for processor 
                   address readahead. 
Examples (Pre-index): 
In each of these examples, the effective offset is added to the Rn (base pointer) register prior to using the Rn register as the 
effective address. Rn is then updated only if the ! suffix is supplied. Coprocessor #1 is used in all cases, for simplicity. 

    STC     1, CR3, [R2]          ; *(R2) = CR3. 
    LDC     1, CR1, [R0, 16]      ; CR1 = *(R0 + 16). Don’t update R0. 
    LDCEQ   1, CR2, [R5, 12]!     ; if (Zflag) CR2 = *(R5 + 12). Then, R5 += 12. 
Examples (Post-index): 
In each of these examples, the effective offset is added to the Rn (base pointer) register after using the Rn register as the 
effective address. Rn is then updated unconditionally, regardless of any ! suffix. Coprocessor #3 is used in all cases, for 
simplicity. 

    STC     3, CR1, [R2], R1!     ; *R2 = CR1. Then R2 += R1. 
    LDC     3, CR1, [R0], 16      ; CR1 = *R0. Then R0 += 16. 
    LDCEQL  3, CR2, [R5], 4       ; if (Zflag) CR2 = *R5, and then (implicitly), R5 += 4. 
                                  ; Use the long option (probably to store multiple words). 
Examples (Expression): 

In these examples, the PLACE label is an internal or external PC-relative label, typically created as shown. PC-relative refer- 
ences are precompensated for the 8-byte read-ahead done by the processor. It may be located up to ±1024 bytes from the 
associated base register, and must be a multiple of 4 bytes in offset. 

        STC     R2, PLACE       ; PC-relative. Same as: STC R2, [PC+8]. 
        B       Across          ; Skip over the data temporary. 
PLACE   DW      0               ; Temporary storage area. 
Across  ...                     ; Resume execution. 

FIGURE 23. COPROCESSOR REG TRANSFER (MCR,MRC) 

 31    28 27     24 23     21  20  19   16 15   12 11    8 7    5  4  3     0
| Cond   | 1 1 1 0 | CP Opc  | L |  CRn  |  Rd   |  CP#  |  CP  | 1 |  CRm  |

Cond       Condition Code 
CP Opc     Coprocessor Operation 
L          Load/Store Bit 
           0 = Store to co-proc 
           1 = Load from co-proc 
Rd         ARM Src/Dst Register 
CP#        Coprocessor Number 
CP         Coprocessor Auxiliary Information 


The instruction is executed only if the 
condition code field is true. The field is 
described in the Condition Codes 
section. 


This is actually a class of instructions, 
rather than a single instruction, and is 
equivalent to the ALU class on the 
APRM. Instructions in this class are 
used to direct the coprocessor to 
perform some operation between an 
APRM register and a coprocessor 
register. It differs from the CPD 
instruction in that the CPD performs 
operations on the coprocessor’s internal 
registers only. 


An example of an MCR usage would be 
a FIX of a floating point value held in 
the coprocessor, where the number is 
converted to a 32-bit integer within the 
coprocessor, and the result then 
transferred back to an APRM register. 
An example of an MRC usage would be 
the converse: a FLOAT of a 32-bit 
value in an APRM register into a 
floating point value within a coprocessor 
register. 


An important use of this instruction is to 
communicate control information 
directly from the coprocessor into the 
APRM PSR flags. As an example, the 
result of a comparison of two floating 
point values within the coprocessor can 
be moved to the PSR to control 
subsequent execution flow. 
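
The flag transfer works because, as described under Transfers To/From R15, only bits 31:28 of the transferred word are copied into N, Z, C and V when R15 is the destination. The sketch below models that rule. It is illustrative only: the function name is hypothetical, and it assumes the four flags occupy bits 31:28 of R15.

```python
FLAG_MASK = 0xF0000000   # N, Z, C, V in bits 31:28 (assumed position in R15)

def transfer_to_r15(old_r15, transferred):
    """Model a coprocessor register transfer to APRM with R15 as destination."""
    # Only the four flag bits change; the PC and other PSR bits are unaffected.
    return (old_r15 & ~FLAG_MASK & 0xFFFFFFFF) | (transferred & FLAG_MASK)
```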


Coprocessor Fields - The CP# field is 
used by all coprocessor instructions to 
specify which coprocessor is being 
invoked. 


The CP Opc, CRn, CP, and CRm fields 
are used only by the coprocessor, and 
the interpretation of these fields is set 
only by convention; other incompatible 
interpretations are allowed. The 
conventional interpretation is that the 
CP Opc and CP fields specify the 
operation for the coprocessor to 
perform, CRn is the coprocessor 
register used as source or destination of 
the transferred information, and CRm is 
the second coprocessor register which 
may be involved in some way depend- 
ent upon the operation code. 

Transfers To/From R15 - When a 
coprocessor register transfer to APRM 
has R15 as the destination, bits 31:28 
of the transferred word are copied into 
the N, Z, C, and V flags, respectively. 
The other bits of the transferred word 
are ignored; the PC and other PSR 
flags are unaffected by the transfer. 

A coprocessor register transfer from 
APRM with R15 as the source register 
will save the PC together with the PSR 
flags. 

Syntax: 
    MCR/MRC{cond} CP#,<expression1>, Rd, CRn, CRm{,<expression2>} 
where cond          Is the conditional execution code, common to all instructions. 
      CP#           Is the (unique) coprocessor number, assigned by hardware. 
      Rd            Is the APRM source or destination register. 
      CRn, CRm      These are valid coprocessor registers: CR0-CR15. 
      expression1   Evaluates to a constant, and is placed in the CP Opc field. 
      expression2   (Where present) evaluates to a constant, and is placed in the CP field. 
Examples: 

MCR ‘1, 10, R1, CR7, CR2 - Request co-proc #1 to do operation 10 on 


MRCEQ 2, 5, R1, cr2, Cr3, 2 
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- CR7 and CR2, putting result into APRM’s R11. 


: If the Z flag is set, transfer the APRM’s R11 reg to the co-proc register (defined 
by hardware), and request co-proc #2 to do oper 5 (type 2) on CR2 and CR3. 
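
The field layout of Figure 23 can be illustrated by packing the fields into a 32-bit word. The following is a minimal sketch in Python, not part of the original data sheet. The function name and parameter names are hypothetical. It assumes the bit positions shown in Figure 23, the fixed pattern 1110 in bits 27-24 and the 1 in bit 4 shown in the instruction set summary, and the usual ARM encoding of 0000 for the EQ condition.

```python
def encode_cp_reg_transfer(cond, cp_opc, to_coprocessor, crn, rd, cp_num, cp=0, crm=0):
    """Pack a coprocessor register transfer into a 32-bit word (illustrative only)."""
    # Check every field against its width in bits before packing.
    for name, value, width in [("cond", cond, 4), ("cp_opc", cp_opc, 3),
                               ("crn", crn, 4), ("rd", rd, 4),
                               ("cp_num", cp_num, 4), ("cp", cp, 3), ("crm", crm, 4)]:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Load/Store bit: 0 = store to co-proc, 1 = load from co-proc.
    l_bit = 0 if to_coprocessor else 1
    return ((cond << 28) | (0b1110 << 24) | (cp_opc << 21) | (l_bit << 20) |
            (crn << 16) | (rd << 12) | (cp_num << 8) | (cp << 5) | (1 << 4) | crm)


# Transfer to co-proc #2, condition EQ, operation 5 (type 2), R1, CR2, CR3.
word = encode_cp_reg_transfer(cond=0b0000, cp_opc=5, to_coprocessor=True,
                              crn=2, rd=1, cp_num=2, cp=2, crm=3)
print(hex(word))
```

Because the CP Opc field is three bits wide in this layout, the sketch rejects operation numbers above 7.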
FIGURE 24. UNDEFINED (RESERVED) INSTRUCTIONS
Note: The above instructions will be presented for execution only if the condition field is true.


If the condition is true, the Undefined
Instruction trap will be taken.


Note that the undefined instruction
mechanism involves offering these
instructions to any coprocessors which
may be present, and all coprocessors
must refuse to accept them by taking
CPA high.


Assembler Syntax - At present the
assembler has no mnemonics for
generating these instructions. If they
are adopted in the future for some
specified use, suitable mnemonics will
be added to the assembler. Until such
time, these instructions should not be
used.


INSTRUCTION SET SUMMARY
The following examples show ways in
which the basic processor instructions
can combine to give efficient code.
None of these methods saves a great
deal of execution time (although they
do save some), mostly they just save
code.


Using Conditional Instructions -
(1) Using conditionals for logical OR, this sequence:

CMP   Rn, p            ; If Rn=p or Rm=q then goto Label
BEQ   Label
CMP   Rm, q
BEQ   Label
can be replaced by
CMP   Rn, p
CMPNE Rm, q            ; If condition not satisfied try other test
BEQ   Label
(2) Absolute value
TEQ   R1, 0            ; Test sign
RSBMI R1, R1, 0        ; and 2's complement if necessary
(3) Multiplication by 4, 5 or 6 (run time)
MOV   R2, R0, LSL 2    ; Multiply by 4
CMP   R1, 5            ; Test value
ADDCS R2, R2, R0       ; Complete multiply by 5
ADDHI R2, R2, R0       ; Complete multiply by 6
(4) Combining discrete and range tests
TEQ   R2, 127          ; Discrete test
CMPNE R2, " "-1        ; Range test
MOVLS R2, "."          ; Then R2 = "."
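
The effect of examples (2) and (3) can be checked with a small behavioural model. This is a sketch in Python, not part of the original data sheet. The function names are hypothetical, and it assumes 32-bit registers with unsigned comparison semantics for the CS and HI conditions.

```python
MASK32 = 0xFFFFFFFF


def absolute_value(r1):
    """Model of TEQ R1,0 / RSBMI R1,R1,0 on a 32-bit two's complement value."""
    if r1 & 0x80000000:              # MI: the sign bit is set
        r1 = (0 - r1) & MASK32       # RSB computes 0 - R1
    return r1


def multiply_by_4_5_or_6(r0, r1):
    """Model of example (3): R2 = R0 * R1 for R1 in {4, 5, 6}."""
    r2 = (r0 << 2) & MASK32          # MOV   R2, R0, LSL 2
    if r1 >= 5:                      # ADDCS: carry set after CMP means R1 >= 5
        r2 = (r2 + r0) & MASK32
    if r1 > 5:                       # ADDHI: unsigned higher means R1 > 5
        r2 = (r2 + r0) & MASK32
    return r2


print([multiply_by_4_5_or_6(7, k) for k in (4, 5, 6)])
```

Each conditional add in the model corresponds to one conditionally executed instruction, which is why no branch is needed in the assembly version.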
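Division and Remainder


; Enter with numbers in R0 and R1


     MOV   R4, 1             ; Bit to control the division
Div1 CMP   R1, 0x80000000    ; Move R1 until greater than R0
     CMPCC R1, R0
     MOVCC R1, R1, LSL 1
     MOVCC R4, R4, LSL 1     ; Keep the control bit in step with R1
     BCC   Div1
     MOV   R2, 0
Div2 CMP   R0, R1            ; Test for possible subtraction
     SUBCS R0, R0, R1        ; Subtract if ok
     ADDCS R2, R2, R4        ; Put relevant bit into result
     MOVS  R4, R4, LSR 1     ; Shift control bit
     MOVNE R1, R1, LSR 1     ; Halve unless finished
     BNE   Div2


; Division result is in R2.
; Remainder is in R0.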
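
The shift-and-subtract method used by the Division and Remainder routine can be modelled as follows. This is a sketch in Python, not part of the original data sheet. The function name is hypothetical, and it assumes unsigned 32-bit operands and that the control bit is shifted left together with the divisor in the first loop. A zero divisor is rejected because the first loop would never terminate.

```python
MASK32 = 0xFFFFFFFF


def divide_and_remainder(dividend, divisor):
    """Return (quotient, remainder) using the shift-and-subtract scheme."""
    if divisor == 0:
        raise ZeroDivisionError("the first loop would not terminate")
    r0, r1 = dividend & MASK32, divisor & MASK32
    r4 = 1                                   # bit to control the division
    # Div1: move the divisor up until it reaches the dividend or bit 31 is set.
    while r1 < 0x80000000 and r1 < r0:
        r1 = (r1 << 1) & MASK32
        r4 = (r4 << 1) & MASK32
    r2 = 0                                   # quotient accumulates here
    # Div2: one conditional subtraction per quotient bit.
    while True:
        if r0 >= r1:                         # test for possible subtraction
            r0 -= r1                         # subtract if ok
            r2 += r4                         # put relevant bit into result
        r4 >>= 1                             # shift control bit
        if r4 == 0:
            break
        r1 >>= 1                             # halve unless finished
    return r2, r0


print(divide_and_remainder(100, 7))
```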


FIGURE 25. INSTRUCTION SET SUMMARY

Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 4 | 3 0

Cond | 0 0 | I | Opcode | S | Rn | Rd | Operand2                      Data Processing
Cond | 0 0 0 0 0 0 | A | S | Rd | Rn | Rs | 1 0 0 1 | Rm              Multiply
Cond | 0 0 0 1 | XXXX XXXX XXXX XXXX | 1 X X 1 | XXXX                 Undefined
Cond | 0 1 | I | P | U | B | W | L | Rn | Rd | Offset (variants)      Load, Store
Cond | 0 1 1 | XXXX XXXX XXXX XXXX XXXX | 1 | XXXX                    Undefined
Cond | 1 0 0 | P | U | S | W | L | Rn | Register List (R15 ... R0)    Multi-Register Transfer
Cond | 1 0 1 | L | Word address offset                                Branch, Call
Cond | 1 1 0 | P | U | N | W | L | Rn | CRd | CP# | Offset            Coproc Data Transfer
Cond | 1 1 1 0 | CP Opc | CRn | CRd | CP# | CP | 0 | CRm              Coproc Data Opr
Cond | 1 1 1 0 | CP Opc | L | CRn | Rd | CP# | CP | 1 | CRm           Coproc Register Transfer
Cond | 1 1 1 1 | Bit space ignored by processor                       Software Interrupt
Pseudo Random Binary Sequence
Generator - It is often necessary to
generate (pseudo-) random numbers
and the most efficient algorithms are
based on shift register-based genera-
tors with exclusive or feedback rather
like a cyclic redundancy check genera-
tor. Unfortunately the sequence of a 32
bit generator needs more than one
feedback tap to be maximal length (i.e.
2^32-1 cycles before repetition), so this
example uses a 33 bit register with taps
at bits 33 and 20. The basic algorithm is
Newbit = bit_33 xor bit_20, shift left the
33 bit number and put in Newbit at the
bottom. Then do this for all the Newbits
needed i.e. 32 of them. Luckily, this can
all be done in 5 S cycles:


; Enter with seed in R0 (32 bits), R1 (1 bit in R1 lsb)
; Uses R2
TST  R1, R1, LSR 1       ; Top bit into carry
MOVS R2, R0, RRX         ; 33 bit rotate right
ADC  R1, R1, R1          ; Carry into lsb of R1
EOR  R2, R2, R0, LSL 12  ; (Involved!)
EOR  R0, R2, R2, LSR 20  ; (Whew!)


; New seed in R0, R1 as before
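
The five-instruction sequence can be compared with the bit-at-a-time description of the generator. This is a sketch in Python, not part of the original data sheet. Both function names are hypothetical. It assumes the 33 bit register is formed with the lsb of R1 as bit 33 and R0 as bits 32 to 1.

```python
MASK32 = 0xFFFFFFFF
MASK33 = (1 << 33) - 1


def prbs_step(r0, r1):
    """Model of the five instructions; returns the new (R0, R1)."""
    carry = r1 & 1                               # TST  R1, R1, LSR 1
    r2 = ((carry << 31) | (r0 >> 1)) & MASK32    # MOVS R2, R0, RRX
    carry = r0 & 1                               #   bit 0 of R0 leaves via carry
    r1 = (r1 + r1 + carry) & MASK32              # ADC  R1, R1, R1
    r2 ^= (r0 << 12) & MASK32                    # EOR  R2, R2, R0, LSL 12
    r0 = r2 ^ (r2 >> 20)                         # EOR  R0, R2, R2, LSR 20
    return r0, r1


def prbs_reference(state33):
    """Bit-serial form: Newbit = bit_33 xor bit_20, repeated 32 times."""
    for _ in range(32):
        newbit = ((state33 >> 32) ^ (state33 >> 19)) & 1
        state33 = ((state33 << 1) | newbit) & MASK33
    return state33


r0, r1 = prbs_step(0x12345678, 1)
print(hex(r0), r1 & 1)
```

Only the lsb of R1 is significant. The higher bits of R1 accumulate earlier values and are ignored by the TST instruction.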


Multiplication by Constant:
(1) Multiplication by 2^n (1,2,4,8,16,32..)
MOV R0, R0, LSL n


(2) Multiplication by 2^n+1 (3,5,9,17..)
ADD R0, R0, R0, LSL n

(3) Multiplication by 2^n-1 (3,7,15..)
RSB R0, R0, R0, LSL n


(4) Multiplication by 6
ADD R0, R0, R0, LSL 1    ; Multiply by 3
MOV R0, R0, LSL 1        ; and then by 2
(5) Multiply by 10 and add in extra number
ADD R0, R0, R0, LSL 2    ; Multiply by 5
ADD R0, R2, R0, LSL 1    ; Multiply by 2 and add in next digit


(6) General recursive method for R1 = R0*C, C a constant:
(a) If C even, say C = 2^n*D, D odd:

D=1:  MOV R1, R0, LSL n
D<>1: (R1 = R0*D)
      MOV R1, R1, LSL n


(b) If C MOD 4 = 1, say C = 2^n*D+1, D odd, n>1:
D=1:  ADD R1, R0, R0, LSL n
D<>1: (R1 = R0*D)
      ADD R1, R0, R1, LSL n


(c) If C MOD 4 = 3, say C = 2^n*D-1, D odd, n>1:
D=1:  RSB R1, R0, R0, LSL n
D<>1: (R1 = R0*D)
      RSB R1, R0, R1, LSL n


This is not quite optimal, but close. An example of its non-optimality is multiply by 45 which is done by:

RSB R1, R0, R0, LSL 2    ; Multiply by 3
RSB R1, R0, R1, LSL 2    ; Multiply by 4*3-1 = 11
ADD R1, R0, R1, LSL 2    ; Multiply by 4*11+1 = 45
rather than by:
ADD R1, R0, R0, LSL 3    ; Multiply by 9
ADD R1, R1, R1, LSL 2    ; Multiply by 5*9 = 45
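
The general recursive method (6) can be written as a short planner that lists the instructions for a given constant. This is a sketch in Python, not part of the original data sheet. The names plan and run are hypothetical, and each step is recorded as (mnemonic, shifted source register, shift amount n). It assumes C is at least 2 and 32-bit wraparound arithmetic.

```python
MASK32 = 0xFFFFFFFF


def plan(c):
    """Return the steps that compute R1 = R0 * c by rules (a), (b) and (c)."""
    if c < 2:
        raise ValueError("c must be at least 2")
    if c % 2 == 0:
        op, base = "MOV", c          # (a) C = 2^n * D
    elif c % 4 == 1:
        op, base = "ADD", c - 1      # (b) C = 2^n * D + 1
    else:
        op, base = "RSB", c + 1      # (c) C = 2^n * D - 1
    n = (base & -base).bit_length() - 1   # number of trailing zero bits
    d = base >> n                         # the odd factor D
    if d == 1:
        return [(op, "R0", n)]
    return plan(d) + [(op, "R1", n)]      # first form R1 = R0 * D


def run(steps, r0):
    """Evaluate a plan on a 32-bit value and return R1."""
    r1 = 0
    for op, source, n in steps:
        shifted = ((r0 if source == "R0" else r1) << n) & MASK32
        if op == "MOV":
            r1 = shifted
        elif op == "ADD":
            r1 = (r0 + shifted) & MASK32
        else:                              # RSB R1, R0, x computes x - R0
            r1 = (shifted - r0) & MASK32
    return r1


print(plan(45))
```

For C = 45 the planner returns the three-instruction sequence given in the text, which illustrates why the method is described as close to optimal rather than optimal.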
Loading a Word with Unknown Alignment:
; Enter with address in R0 (32 bits)
; Uses R1, R2, R3; result in R2.
; Note R2 must be less than R3, e.g. 2, 3

BIC   R1, R0, 3           ; Get word aligned address.
LDMIA R1, {R2,R3}         ; Get 64 bits containing answer.
AND   R1, R0, 3           ; Correction factor in bytes,
MOVS  R1, R1, LSL 3       ; now in bits; test if aligned.
MOVNE R2, R2, LSR R1      ; Produce bottom of result word (if not aligned).
RSBNE R1, R1, 32          ; Get other shift amount.
ORRNE R2, R2, R3, LSL R1  ; Combine two halves to get result.
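
The register arithmetic of this routine can be followed with a small model. This is a sketch in Python, not part of the original data sheet. The function name is hypothetical and memory is assumed to be a list of 32-bit words indexed by word address. The model only mirrors the shifts performed by the instructions; which bytes of memory end up in the result depends on the byte order of the memory system.

```python
MASK32 = 0xFFFFFFFF


def load_unaligned(words, address):
    """Model of the routine: returns the value left in R2."""
    index = address >> 2                        # BIC   R1, R0, 3
    r2, r3 = words[index], words[index + 1]     # LDMIA R1, {R2,R3}
    shift = (address & 3) << 3                  # AND, then MOVS ... LSL 3
    if shift != 0:                              # NE: address was not aligned
        r2 = r2 >> shift                        # MOVNE R2, R2, LSR R1
        r2 |= (r3 << (32 - shift)) & MASK32     # RSBNE, then ORRNE
    return r2


memory = [0x11223344, 0x55667788, 0x99AABBCC]
print([hex(load_unaligned(memory, a)) for a in range(5)])
```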
Sign Extension of Partial Word

MOV R0, R0, LSL 16   ; Move to top
MOV R0, R0, LSR 16   ; ... and back to bottom
                     ; (Use ASR to get sign extended version).
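
The difference between the LSR and ASR forms can be seen in a short model. This is a sketch in Python, not part of the original data sheet. The function names are hypothetical and a 16-bit partial word in a 32-bit register is assumed.

```python
MASK32 = 0xFFFFFFFF


def zero_extend16(r0):
    """MOV R0,R0,LSL 16 followed by MOV R0,R0,LSR 16."""
    top = (r0 << 16) & MASK32        # move to top
    return top >> 16                 # logical shift fills with zeros


def sign_extend16(r0):
    """MOV R0,R0,LSL 16 followed by MOV R0,R0,ASR 16."""
    top = (r0 << 16) & MASK32        # move to top
    result = top >> 16
    if top & 0x80000000:             # arithmetic shift copies bit 31 downwards
        result |= 0xFFFF0000
    return result


print(hex(zero_extend16(0x1234FFFE)), hex(sign_extend16(0x1234FFFE)))
```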

Return, Setting Condition Codes

BICS   PC, R14, CFLAG   ; Returns, clearing C flag from link register.
ORRCCS PC, R14, CFLAG   ; Conditionally returns, setting C flag.


; Above code should not be used except in User mode, since it will reset the interrupt enable flags to
; their value when R14 was set up. This generally applies to non-user mode programming.
; e.g., MOVS PC,R14. MOV PC,R14 is safer!




Appendix 1 - Differences Between 
VL86C010 (ARM) And APRM 

The modifications made to the ARM in 
order to create the APRM predomi- 
nantly affect four instructions; load, load 
multiple, store, and store multiple. For 
the load and load multiple instructions 
both byte and word operations are 
modified. Only the word functions of 
the store and store multiple are altered. 


The APRM uses the "BigEndian" style
byte addressing modes that are the 
same as the MC680x0 processor 
family. See Table 1 for examples. 


The APRM allows the user to combine 
segments from two aligned words of 
data into one nonaligned word oriented 
as shown in Table 1. The data is 
loaded via a nine step process which 
generates 2 complete memory ac- 
cesses (dbl access). The expected 
address is generated by the APRM
(step 1); the user must notice that the
address is nonaligned and freeze the 
clock (step 2). The user then provides 
the first word of data (step 3) and brings 
the NADR signal high (step 4). NADR 
will latch the appropriate bytes of data 
from the first word and cause the APRM 
to output the new address (step 5). 

This address is the first address 
incremented by four bytes. The user 
will provide the second word of data 
(step 6), deassert NADR (step 7) and 
restart the clock (step 8). The APRM 
will take the combination of these two 
words shifted internally by the proper 
amount and load them into the destina- 
tion register (step 9). See Figure 1 for a 
timing sequence of this procedure. 


Table 2 details the shift results for the
various address combinations during
store operations. External hardware
must freeze the processor clock and
enable the proper memory lanes for
nondestructive un-aligned word stores.
NADR functions during double access
stores of un-aligned word values to
generate the next word aligned address
for writing the second data segment.


The APRM also provides 32 address
signals although the program area is
still limited to the lower 26 bits. When-
ever the APRM is performing an opcode
fetch the upper six bits are forced low.
Increasing the size of the address
space made the address exception
check unnecessary as the old exception
areas are now valid memory locations.
An input is added, MSBLOW, that when
asserted high causes the upper eight
bits of the address bus to go low.


A programmable page detector is also
added. It can be programmed for 256,
512, 1024, or 2048 word pages.
Whenever the next address, if synchro-
nous, would be the last word of the
page a new signal called PGHIT would
be asserted. See Table 3 for the
decodes of the page inputs for the
various page sizes.


TABLE 1. SHIFTS FOR LOAD OPERATIONS

B/-W  Byte Address  Data Bus Value      VL86C010 Shifter  APRM Shifter
      (A1, A0)                          (D31-D0)          (D31-D0)

1     00            11223344            00000044          00000011
1     01            11223344            00000033          00000022
1     10            11223344            00000022          00000033
1     11            11223344            00000011          00000044
0     00            11223344            11223344          11223344
0     01            11223344, 55667788  44112233          22334455
0     10            11223344, 55667788  33441122          33445566
0     11            11223344, 55667788  22334411          44556677

VL86C010: no support for dbl access.
APRM: no shift or dbl access for address 00; with support for dbl access for addresses 01, 10 and 11.


TABLE 2. SHIFTS FOR STORE OPERATIONS

Address Value  Register Data  Output Data at Pins

00             11223344       11223344
01             11223344       44112233
10             11223344       33441122
11             11223344       22334411
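
The APRM entries of Tables 1 and 2 follow a regular pattern that can be expressed in a few lines. This is a sketch in Python, not part of the original data sheet and not a description of the internal shifter. The function names are hypothetical. It assumes BigEndian byte numbering, so that byte address 0 is the most significant byte of a word, and a byte offset of 0 to 3 taken from address bits A1 and A0.

```python
MASK32 = 0xFFFFFFFF


def aprm_load_word(first_word, second_word, offset):
    """Combine two aligned words into the nonaligned word at the given byte offset."""
    shift = 8 * offset
    if shift == 0:
        return first_word & MASK32            # aligned: no shift, no dbl access
    return ((first_word << shift) | (second_word >> (32 - shift))) & MASK32


def aprm_load_byte(word, offset):
    """Select the byte at the given offset, counting from the most significant end."""
    return (word >> (24 - 8 * offset)) & 0xFF


def store_output(register_data, offset):
    """Rotate the register right by 8 * offset bits, as in the output column of Table 2."""
    shift = 8 * offset
    return ((register_data >> shift) | (register_data << (32 - shift))) & MASK32


print(hex(aprm_load_word(0x11223344, 0x55667788, 1)))
print(hex(store_output(0x11223344, 1)))
```

For a nonaligned store the rotated value is driven on both accesses, and the external memory lane enables decide which bytes are written in each word.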


TABLE 3. MEMORY PAGE SIZE

Page (1,0)  Page Size

00          256 Words
01          512 Words
10          1024 Words
11          2048 Words
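
The page decode of Table 3 and the PGHIT condition can be sketched as follows. This is a Python illustration, not part of the original data sheet. The names PAGE_WORDS and pghit are hypothetical. It assumes byte addresses with four bytes per word, and that "the last word of the page" means the highest word index within the selected page size.

```python
# Page (1,0) input value mapped to page size in words, from Table 3.
PAGE_WORDS = {0b00: 256, 0b01: 512, 0b10: 1024, 0b11: 2048}


def pghit(next_byte_address, page_select):
    """Return True if the next address falls on the last word of its page."""
    words = PAGE_WORDS[page_select]
    word_index = (next_byte_address >> 2) % words   # word position inside the page
    return word_index == words - 1


print(pghit(255 * 4, 0b00), pghit(255 * 4, 0b01))
```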
FIGURE 1. NONALIGNED MEMORY CYCLES
[Timing diagram; surviving labels: CLK, Step 2, Step 8]
PACKAGE OUTLINES
100-PIN CERAMIC PIN GRID ARRAY


100-PIN QUAD PLASTIC FLATPACK (QPFP)